|
Jumping right into an example (fixed-pitch font recommended):

    English            French
    I'm in England     Je suis en Angleterre
    I'm in France      Je suis en France
    I'm in Italy       Je suis en Italie
    I'm in London      Je suis à Londres
    I'm in Paris       Je suis à Paris
    I'm in Milan       Je suis à Milan

Tell an AI that England, France and Italy are countries, and that London, Paris and Milan are towns, then feed it the above examples. If the AI is capable of comparison and generalization, it will immediately notice that French uses the prepositions asymmetrically: en before the countries, à before the towns.

Being in a country and being in a town are not totally comparable in the details, but a language needn't express this difference, since it is implied by the category of the toponym. English doesn't express it: it uses "in" for both. I wonder where and how the paths diverge in the pipelines from a rich mental representation down to an intentionally sketchy language.

It is agreed that we do *not* want to hard-code anything in our AI; it must learn the different languages by itself, through the organization of its own mental structures. If the AI has a general-purpose to-be-located-in/at kind of descriptor, then the pipeline into English is straightforward.
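To make the "noticing" step concrete, here is a minimal sketch of how the asymmetry could be picked up from the six examples without hard-coding any rule. Everything in it is an assumption made for illustration: the names EXAMPLES, learn_prepositions and say_location are hypothetical, the preposition is assumed to be the third word of each French sentence, and plain counting stands in for whatever mental structures a real AI would build.

```python
from collections import Counter, defaultdict

# (category of the toponym, French sentence) pairs taken from the examples.
# The categories are the only knowledge handed to the learner.
EXAMPLES = [
    ("country", "Je suis en Angleterre"),
    ("country", "Je suis en France"),
    ("country", "Je suis en Italie"),
    ("town", "Je suis à Londres"),
    ("town", "Je suis à Paris"),
    ("town", "Je suis à Milan"),
]


def learn_prepositions(examples):
    """Count which preposition follows 'Je suis' for each toponym category."""
    counts = defaultdict(Counter)
    for category, sentence in examples:
        preposition = sentence.split()[2]  # assumed position: third word
        counts[category][preposition] += 1
    # Keep the most frequently observed preposition for each category.
    return {cat: seen.most_common(1)[0][0] for cat, seen in counts.items()}


LEARNED = learn_prepositions(EXAMPLES)


def say_location(french_toponym, category):
    """Generate a French sentence for a toponym that was never seen before."""
    return f"Je suis {LEARNED[category]} {french_toponym}"


print(LEARNED)
print(say_location("Espagne", "country"))
print(say_location("Lyon", "town"))
```

The learner is never told "countries take en"; it only sees categories and sentences. It also has no notion of context, so it cannot separate cases where the same place takes different prepositions depending on what the person is doing there.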
But into French it splits into two cases:

  to-be-located-in region/country/continent
    I'm in Provence              Je suis en Provence
    I'm in Spain                 Je suis en Espagne
    I'm in Europe                Je suis en Europe

  to-be-located-in room/building/district/town
    I'm in the kitchen           Je suis à la cuisine
    I'm in the movie theater     Je suis au cinéma           [au = à + le]
    I'm in the Quartier Latin    Je suis au Quartier Latin
    I'm in Lyon                  Je suis à Lyon

Things get trickier when you know that dans can be used instead of à, with a different meaning:

    Il est au musée              He's in the museum (to look at the exhibits)
    Il est dans le musée         He's inside the museum building (possibly by night, to steal something)
    Il est à la cuisine          He's in the kitchen (to cook the food)
    Il est dans la cuisine       He's in the kitchen (possibly experimenting with alchemy in the sink)

So the to-be-in descriptor isn't enough to generate French sentences; the context is often important, because it gets entangled with the choice of words all along the pipeline.

Err... I realize I may be discouraging everybody from studying French, so I will give other tricky examples that occur in English:

    He's at school               (to learn)
    He's in the school           (to do the plumbing or something)
    He's at sea                  (to work or enjoy)
    He's on the sea              (and not on the ground)
    He's at war                  (to fight)
    He's (caught) in the war     (caught in the troubles)

These in French are:

    Il est à l'école
    Il est dans l'école
    Il est en mer
    Il est sur la mer
    Il est à la guerre
    Il est (pris) dans la guerre

Compare the en in en mer with the en for geographical areas.

All these things seem disconcerting, but I don't believe our minds are so twisted. There is some logic behind all that, and we have to catch it if we want to make straightforward, efficient, supple, independent AIs. Within a context, an AI can learn to tell these nuances apart on its own, but it won't with isolated sentences.

Hm, I will stop here for today. If I'm rambling, please tell me.
If I'm reinventing the wheel, please tell me too; I hope these things are already implemented in existing systems I'm not aware of.

Thanks for reading, and thanks in advance for your criticism.

~leo
|