A: Only at the start of a sentence. The command name is all lower case so it seems silly to capitalize the project name, but not capitalizing the start of sentences is awkward, so... compromise. (It is _not_ "ToyBox".)
A: Toybox started back in 2006 when I handed off BusyBox maintainership and started over from scratch on a new codebase after a protracted licensing argument took all the fun out of working on BusyBox.
Toybox was just a personal project until it got relaunched in November 2011 with a new goal to make Android self-hosting. This involved me relicensing my own code, which made people who had never used or participated in the project loudly angry. The switch came after a lot of thinking about licenses and the transition to smartphones, which led to a 2013 talk laying out a strategy to make Android self-hosting using toybox. This helped bring it to Android's attention, and they merged it into Android M.
The answer to the second question is "licensing". BusyBox predates Android by almost a decade but Android still doesn't ship with it because GPLv3 came out around the same time Android did and caused many people to throw out the GPLv2 baby with the GPLv3 bathwater. Android explicitly discourages use of GPL and LGPL licenses in its products, and has gradually reimplemented historical GPL components such as its Bluetooth stack under the Apache license. Similarly, Apple froze Xcode at the last GPLv2 releases (GCC 4.2.1 with binutils 2.17) for over 5 years while it sponsored the development of new projects (clang/llvm/lld) to replace them, implemented its SMB server from scratch to replace Samba, and so on. Toybox itself exists because somebody in a legacy position just wouldn't shut up about GPLv3, otherwise I would probably still happily be maintaining BusyBox. (For more on how I wound up working on BusyBox in the first place, see here.)
A: Our longstanding rule of thumb is to try to run and build on hardware and distributions released up to 7 years ago, and feel ok dropping support for stuff older than that. (This is a little longer than Ubuntu's Long Term Support, but not by much.)
If a kernel or libc feature is less than 7 years old, I try to have a build-time configure test for it and let the functionality cleanly drop out. I also keep old Ubuntu images around in VMs and perform the occasional defconfig build there to see what breaks. (I'm not perfect about this, but I accept bug reports.)
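As a rough illustration of what a build-time configure test can look like, here is a minimal Python sketch. It is not toybox's actual build machinery: the names `compiles`, `config_line`, `GETRANDOM_PROBE`, and the `HAVE_` prefix are made up for this example. It tries to compile a tiny C program that uses the feature and records a 1 or 0 that the rest of the build could key off, so the functionality drops out when the probe fails.

```python
import os
import shutil
import subprocess
import tempfile


def compiles(c_source, cc="cc"):
    """Return True if the C snippet compiles and links with the given compiler."""
    if shutil.which(cc) is None:
        return False  # no compiler found: treat the feature as unavailable
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "probe.c")
        with open(src, "w") as f:
            f.write(c_source)
        result = subprocess.run(
            [cc, src, "-o", os.path.join(tmp, "probe")],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        return result.returncode == 0


def config_line(name, available):
    """Format one line of a generated header: 1 if the probe passed, else 0."""
    return "#define HAVE_%s %d" % (name, 1 if available else 0)


# Hypothetical probe: does this libc provide getrandom() in <sys/random.h>?
GETRANDOM_PROBE = """
#include <sys/random.h>
int main(void) { char c; return getrandom(&c, 1, 0) < 0; }
"""

if __name__ == "__main__":
    # The printed value depends on the compiler and libc of the build machine.
    print(config_line("GETRANDOM", compiles(GETRANDOM_PROBE)))
```

The point of defining the symbol either way (1 or 0, rather than defined or undefined) is that code can test it in an ordinary `if` and let the compiler discard the dead branch, instead of scattering `#ifdef` blocks around.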
My original theory was "4 to 5 18-month cycles of Moore's law should cover the vast majority of the installed base of PC hardware", loosely based on some research I did back in 2003 and updated in 2006. That research said low end systems were 2 iterations of Moore's law below the high end systems, and that another 2-3 iterations should cover the useful lifetime of most systems no longer being sold but still in use and potentially being upgraded to new software releases.
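The arithmetic behind that estimate is easy to check. This is just a sketch of the numbers quoted above (the function name is made up for illustration): 2 cycles for the low end gap plus 2 to 3 more for systems still in use gives 4 to 5 cycles of 18 months each.

```python
MONTHS_PER_CYCLE = 18  # cycle length quoted in the text


def cycles_to_years(cycles):
    """Convert a number of 18-month cycles into years."""
    return cycles * MONTHS_PER_CYCLE / 12


low = 2 + 2   # low end gap plus the shorter still-in-use estimate
high = 2 + 3  # low end gap plus the longer still-in-use estimate
print(cycles_to_years(low), cycles_to_years(high))  # prints "6.0 7.5"
```

So the original theory works out to a window of 6 to 7.5 years, which is where a 7 year rule of thumb sits.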
It turns out I missed industry changes in the 1990s that stretched the gap from low end to high end from 2 cycles to 4 cycles (here's my writeup on that), and _that_ analysis ignored the switch from PC to smartphone cutting off the R&D air supply of the laptop market. Meanwhile the Moore's Law s-curve started bending down in 2000 and these days is pretty flat: the drive for faster clock speeds stumbled then died, and the subsequent drive to go wide maxed out around 4x SMP with ~2 megabyte caches for most applications. These days the switch from exponential to linear growth in hardware capabilities is common knowledge.
But the 7 year rule of thumb stuck around anyway.