Would comparing all of those files against their unencrypted backups in addition to the other cracking algorithms help discover the key?
Once activated, the malware will attempt to contact its command-and-control (C&C) network and either use a compiled-in public key (the attackers hold the matching private key) or generate, as securely as possible, a public/private key pair, send the private key to the C&C, and delete its local copy.
Now the malware has a public key. It then generates either a single symmetric key, or one symmetric key for every file it attacks, and encrypts that key with the public key. This way it can use a very fast symmetric algorithm to encrypt the file, while keeping the decryption key protected by a safer, albeit much slower, asymmetric algorithm.
The malware now creates an encrypted copy of the target file, then does its best to destroy all copies of the original (e.g. shadow copies). Finally, it renames the encrypted copy to the same name as the original, plus some extension.
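The hybrid scheme described above can be sketched in a few lines of Python. Everything here is a deliberately weak toy for illustration only: the RSA pair uses the classic textbook primes (p = 61, q = 53) and the "symmetric cipher" is a SHA-256 counter-mode keystream XORed onto the data. Real malware uses real AES and 2048-bit-plus RSA.

```python
import hashlib

# Toy textbook RSA: n = 61 * 53 = 3233, e * d = 1 mod phi(n). NOT real crypto.
N, E, D = 3233, 17, 2753

def keystream(key: int, length: int) -> bytes:
    """Toy stream cipher: SHA-256 of (key, counter) as a keystream."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(f"{key}:{counter}".encode()).digest()
        counter += 1
    return bytes(out[:length])

def sym_crypt(key: int, data: bytes) -> bytes:
    # XOR stream cipher: encryption and decryption are the same operation
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

def encrypt_file(plaintext: bytes, file_key: int):
    # Per-file symmetric key, sealed with the public key (E, N)
    wrapped_key = pow(file_key, E, N)
    return wrapped_key, sym_crypt(file_key, plaintext)

def decrypt_file(wrapped_key: int, ciphertext: bytes) -> bytes:
    # Only the holder of the private exponent D can unwrap the file key
    file_key = pow(wrapped_key, D, N)
    return sym_crypt(file_key, ciphertext)
```

The point of the structure is visible even in the toy: the fast symmetric pass touches every byte of every file, while the slow asymmetric operation runs once per key.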
Short of ransoming back the private key, and barring errors on the programmers' part (e.g. they did not securely delete the private key locally and it can be recovered), there is no practical chance of getting the symmetric key.
It is still possible to try to brute-force the symmetric part of the encryption, knowing what the decrypted text looks like and the structure of the encrypted file (something like [32 BYTES SIGNATURE][4K OF ASYM-ENCRYPTED KEY][SYM-ENCRYPTED DATA]). This is where a known plaintext attack (KPA) might come in. But it requires the symmetric encryption to be KPA-vulnerable.
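As a concrete illustration of that layout, here is a minimal parser for the hypothetical container format sketched above (the field sizes are the ones from the example, not any real ransomware format):

```python
# Hypothetical container layout from the example above:
# [32-byte signature][4096-byte asym-encrypted key][sym-encrypted data]
SIG_LEN = 32
KEYBLOB_LEN = 4096

def split_container(blob: bytes):
    """Split an encrypted file into its three assumed fields."""
    if len(blob) < SIG_LEN + KEYBLOB_LEN:
        raise ValueError("file too short for the assumed layout")
    signature = blob[:SIG_LEN]
    wrapped_key = blob[SIG_LEN:SIG_LEN + KEYBLOB_LEN]
    ciphertext = blob[SIG_LEN + KEYBLOB_LEN:]
    return signature, wrapped_key, ciphertext
```

Knowing where the symmetric ciphertext starts is what lets you line it up against the unencrypted backup in the first place.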
Yes... and no. What you propose would be exactly such a "known plaintext attack" (KPA).
But even if all the files were encrypted with the same key (which is not at all a given), the time required for a successful attack against a strong, properly implemented algorithm - as most recent malware employs - is astronomical. In practice you would be running a brute-force decryption, using the known plaintext only to confirm the correctness of a candidate key, and confirming that the key is the same for all files only at the very end: until you break the encryption of the first file, you cannot know whether you have gained access to 100% of your files, or only to 0.0000001%.
So you could get the key from the comparison, but you would not be deriving the key directly from it (that is only possible if the algorithm or its implementation is flawed). As @Kevin observed, such algorithms are said to be resistant to known plaintext attacks.
If you have a backup, restore everything from it. Files that are still encrypted, and for which no sufficiently recent copy is available, can be decrypted using the ransomware's own tools (i.e., surrendering and paying up), or you can try one of the several utilities that exploit known flaws in some ransomware implementations: flaws that expose the key, leave the original data in some recoverable form, or allow shortcuts in brute-forcing the key.
Keep in mind that some ransomware authors are also behind some of these so-called "tools". At the very least, such tools should be purchased with a capped, traceable credit card with a limited balance (e.g. a prepaid card).
I imagine you've already concluded that a single user account capable of accessing all ten million of those files, in the hands of someone not knowledgeable enough to notice that something suspicious is happening - you don't encrypt 10 M files with a snap of the fingers - is a Very Bad Thing.
To date, this kind of malware has few attack vectors, and almost all of them rely on exploiting unnecessary user privileges. Removing or curtailing those privileges will effectively defang most malware of this kind, so it is a good idea to review local and group security policies in large organizations, and to place some limits/checks on BYOD policies as well. I've heard it rumored that some malware variants go stealthy and delay encryption depending on how many files/network shares they are able to "see". I could not verify this, but the idea is not hard to come up with, and even if it is not true now and would require a different approach in the attack (going resident instead of running straight away), it might well become reality in the future.
Traffic analysis - simply policing bandwidth to the various workstations - should have raised an alert that something was afoot, and could even have pinpointed the culprit, even if this would have come too late for many of the files.
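A crude version of that kind of policing can be sketched as a per-workstation sliding-window rate alarm. The class name and thresholds here are made up for the example; in practice you would feed it events from whatever audit source you have (file-server auditing, SMB logs, NetFlow):

```python
from collections import deque

class WriteRateAlarm:
    """Flag any sliding window that contains too many file writes.

    Hypothetical detector for illustration: a workstation rewriting
    thousands of files per minute is behaving like ransomware.
    """

    def __init__(self, max_writes: int, window_seconds: float):
        self.max_writes = max_writes
        self.window = window_seconds
        self.events = deque()  # timestamps of recent writes

    def record(self, timestamp: float) -> bool:
        """Record one write event; return True if the threshold is exceeded."""
        self.events.append(timestamp)
        # Drop events that have slid out of the window
        while self.events and timestamp - self.events[0] > self.window:
            self.events.popleft()
        return len(self.events) > self.max_writes
```

Even a threshold this blunt would have tripped long before ten million files were touched, though, as noted above, still too late for some of them.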
Further, if most files have not changed, then more frequent incremental backups are in order. If only 1% of the files change between runs, then with the same resources you can run incremental backups roughly two orders of magnitude more often (in practice it is not quite so straightforward, but still).
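The back-of-the-envelope behind "two orders of magnitude", with made-up numbers:

```python
# Illustrative figures, not from the original post.
full_backup_gb = 10_000      # volume moved by one full backup
changed_fraction = 0.01      # only 1% of the files change between runs

incremental_gb = full_backup_gb * changed_fraction  # 100 GB per incremental

# With the same I/O budget, frequency scales inversely with volume moved:
frequency_gain = full_backup_gb / incremental_gb
print(frequency_gain)  # 100.0 -> one full per day becomes ~100 incrementals
```

The caveats ("not so straightforward") are things like catalog overhead, restore complexity, and change rates that are bursty rather than uniform.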