Keyboard handling on X11 host systems
VirtualBox handles keyboard input rather differently to most X11 applications, because guest operating systems expect to read key numbers from a physical keyboard. These numbers tell the guest about the position of the key on the keyboard rather than the symbol printed on it. For instance, "]" on a US keyboard and "ъ" on a Russian keyboard produce the same number, and the operating system has no way, without being told, of knowing which of the two layouts is being used. (Unless you are from the US, you are quite likely familiar with setting the keyboard layout on your physical system. The same thing applies to virtual systems.)
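To make the distinction between key position and key symbol concrete, here is a minimal Python sketch. It is purely illustrative and not taken from VirtualBox: the table is a hypothetical three-key excerpt of the US, French and Russian layouts, indexed by the PC "set 1" scan code of each key position, and the names LAYOUTS and symbol_for are made up for this example.

```python
# Hypothetical excerpt of three layouts, indexed by PC "set 1" scan code.
# The scan code identifies a position on the keyboard, not a symbol.
LAYOUTS = {
    "us": {0x10: "q", 0x1E: "a", 0x1B: "]"},
    "fr": {0x10: "a", 0x1E: "q", 0x1B: "$"},
    "ru": {0x10: "й", 0x1E: "ф", 0x1B: "ъ"},
}


def symbol_for(scan_code, layout):
    """Return the unshifted symbol that a layout assigns to a key position."""
    return LAYOUTS[layout][scan_code]


# The keyboard sends the same number in every case; only the layout
# chosen by the operating system decides which symbol comes out.
for name in ("us", "fr", "ru"):
    print(name, symbol_for(0x1B, name))
```

The keyboard itself never changes what it sends. Only the layout selected in the operating system, here the second argument to symbol_for, decides which symbol the user ends up with.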
VirtualBox does not normally look at the key symbols of the key events which it receives from the host operating system at all. (There is code in there to look at them, but that is mainly legacy code which we have not yet got rid of; we may even have done that by the time you read this.) Instead, it looks at the key code, which corresponds to the actual physical key which was pressed. It uses an X server extension called XKB to find out which key on a PC keyboard the key code corresponds to, and pretends to the guest that that key on the simulated virtual keyboard was pressed. So when you hold down shift and press "A", the guest sees the numbers for "shift pressed", "A pressed" (the actual number depends on where on your particular keyboard the letter A is located), "A released" and "shift released". It does not get a number corresponding to capital A, and if the layout in the guest is not set up correctly (e.g. you have a French keyboard but the guest thinks it is a US one), software in the guest will usually see a different character, in this case a capital "Q". Conversely, if you have the layout set up correctly in the guest but wrongly on the host, applications in the guest will normally still get the right letters and symbols, because the host layout plays no part in the translation.
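The following Python sketch walks through that example. It is not VirtualBox's code and makes two loudly stated assumptions: instead of querying XKB, it assumes an evdev-style X server on which the X11 key code of the main alphanumeric keys is the PC "set 1" scan code plus 8, and it models the guest as a tiny decoder that only knows left shift and two letter keys. All names (to_guest_scan_code, guest_decode and the tables) are hypothetical.

```python
# Assumption: evdev-style X server, where for the main keys
# X11 key code == PC "set 1" scan code + 8. VirtualBox asks XKB instead.
X11_KEYCODE_OFFSET = 8
BREAK_BIT = 0x80      # set 1 marks a key release by setting the top bit
LEFT_SHIFT = 0x2A     # set 1 scan code of the left shift key

# Hypothetical guest layouts: key position -> unshifted letter.
GUEST_US = {0x10: "q", 0x1E: "a"}
GUEST_FR = {0x10: "a", 0x1E: "q"}


def to_guest_scan_code(x11_keycode, pressed):
    """Translate a host key code into the number the guest will read."""
    make = x11_keycode - X11_KEYCODE_OFFSET
    return make if pressed else make | BREAK_BIT


def guest_decode(scan_codes, layout):
    """Very rough model of a guest turning scan codes into text."""
    shift = False
    text = []
    for code in scan_codes:
        released = bool(code & BREAK_BIT)
        make = code & ~BREAK_BIT
        if make == LEFT_SHIFT:
            shift = not released
        elif not released and make in layout:
            letter = layout[make]
            text.append(letter.upper() if shift else letter)
    return "".join(text)


# Shift + the key labelled "A" on a French keyboard. That key sits where
# "Q" is on a US keyboard, so under the assumption above its X11 key code
# is 24; left shift is 50.
events = [(50, True), (24, True), (24, False), (50, False)]
scan_codes = [to_guest_scan_code(code, pressed) for code, pressed in events]

print([hex(code) for code in scan_codes])
print(guest_decode(scan_codes, GUEST_US))  # guest wrongly set to US
print(guest_decode(scan_codes, GUEST_FR))  # guest correctly set to French
```

Note that the host layout never appears in the sketch: the same four numbers reach the guest whatever layout the host is using, and only the guest's own layout table decides whether they come out as "Q" or "A".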
This all works fine as long as the keyboard on your host system roughly matches a PC keyboard. Since this covers the vast majority of our use cases, we do not currently attempt to handle other cases. (The legacy code mentioned above does, but not well enough to be worth discussing.) One use case which does not work well is remote VNC sessions, because even though the keyboard is generally a PC one, the VNC server does not usually provide enough information for VirtualBox to work out which physical keys were pressed. We can think of a few possible solutions to the problem, but none that we find really convincing, and as the number of users affected is rather low we would rather concentrate our efforts elsewhere.
We would of course accept sensible code contributions to make other use cases work better. A few things to consider are that the code should work well and be well tested, it should not cause regressions for other use cases (we would expect the submitter to convince us that they have tested this well) and it should not make our existing code harder to understand. We would probably not be able to accept donations in return for writing the code, due to lack of available developer time. If you feel that way inclined, you would probably be better off finding a third party developer who would be willing to take the donations and write and submit the code. They should probably talk to us before starting and co-ordinate the work with us to make the process work well and avoid wasted effort.