February 24, 2007
Copyrighting Databases
Here is a brief summary of the discussion about a GNU license for databases/dictionaries, and a preliminary conclusion drawn from it.
| “You cannot copyright databases in the US AFAIK. There was a case about a phone dictionary” | “why would we regard some dictionaries' definitions as better than others? There is not a single, correct definition of any English word”. |
| “A dictionary requires as much work as a phone book and isn't a very creative process” | “The amount of work isn't important, it's about the creativity. Writing all those definitions in the dictionary requires creativity, so you get copyright on the dictionary”. |
| “you cannot copyright the name + number in that phone book, since that is considered a ‘fact’.” | |
| “a list (database) of genomes for a bunch of species isn't copyrightable either” | |
Yes, phone numbers, contact info, and genomes are definitely facts. For a database of that kind, I don't claim that its content is subject to copyright, but its design may be.
By contrast, natural languages, so far, don't have a structured architecture. Most people believe that it's impossible, or at least very difficult, to put them into a structured shape.
So when you build a bilingual dictionary, as opposed to a mere wordlist, you are in fact trying to convert something unstructured into something structured. You are trying to create and innovate 'structuring standards', and to structure the content according to those standards.
So, the procedures and the activities that make databases subject to copyright are:
- Normalization: Designing UML/ERD.
- Structuring something unstructured: Word definitions are not “Facts”.
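To make the difference between a wordlist and a structured dictionary a bit more concrete, here is a minimal sketch in Python. It is only an illustration: the `Entry` and `Sense` classes, their fields, and the sample English/French data are my own assumptions, not part of the discussion above or of any real dictionary project.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Sense:
    """One meaning of a headword (hypothetical structure)."""
    part_of_speech: str
    definition: str  # authored text, not a looked-up fact
    translations: List[str] = field(default_factory=list)


@dataclass
class Entry:
    """A dictionary entry: a headword plus its distinct senses."""
    headword: str
    senses: List[Sense] = field(default_factory=list)


def to_wordlist(entries: List[Entry]) -> List[Tuple[str, str]]:
    """Flatten structured entries into bare (headword, translation) pairs."""
    return [
        (entry.headword, translation)
        for entry in entries
        for sense in entry.senses
        for translation in sense.translations
    ]


# Illustrative sample data: the sense split and definitions are design decisions.
bank = Entry("bank", [
    Sense("noun", "an institution that holds and lends money", ["banque"]),
    Sense("noun", "the land along the edge of a river", ["rive", "berge"]),
])

print(to_wordlist([bank]))
```

Flattening throws away exactly the parts that took design work: how the senses are split, how each definition is worded, and which translation belongs to which sense. What is left looks much more like the name + number pairs of a phone book.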
Categories: Business, Internationalization




