Talk:RSA code: Difference between revisions

Content added Content deleted

Inline

Revision as of 18:08, 24 April 2011

Draft

There's a lot to do to clean this up to the level of being a full task. For one thing, it's not at all clear what this code is trying to do! It also lacks links to resources (e.g., wikipedia) that describe the algorithm involved, and the Python solution needs much work too (especially in its surrounding descriptive text, which looks to me more like text that belongs here on the talk page). –Donal Fellows 06:31, 24 March 2011 (UTC)

+1 on Dkf's comments. How to perform the task needs to be in the task description in a more language neutral form. --Paddy3118 13:12, 24 March 2011 (UTC)

Sorry about the shoddy state of my code, it's one of the first programs i have written on my own, and there are probably far more efficient ways of performing many functions that i just brute forced my way past. But it works, and i am rather proud of it, ugly as it may be. I have added detail about the function of the code and the way the RSA algorithm works. Please let me know if there is anything else i can clear up! --Erasmus 03:00, 26 March 2011 (UTC)

I believe my code is now annotated enough so that it makes sense to read it. I am streamlining (at least as much as i can) a program to generate new keys to use, i will upload that when i am done --Erasmus 02:21, 29 March 2011 (UTC)

OK, things are starting to look better. We still lack a description of what the task is though! I guess it's to do the encryption and decryption (with fixed trivial keys?) and not to do key generation; the latter is much more computationally expensive and doesn't serve as an introduction to this area of cryptography. –Donal Fellows 07:26, 2 April 2011 (UTC)

Split the task?

And given that the task seems to be about doing RSA, does the Python example really need a hundred lines of UI? Pjt33 12:56, 2 April 2011 (UTC)

Probably not. I really like that there are two different UI approaches shown, one using a library and one not, but that's probably better done in a task more targeted for UI operations. Which we don't have enough of, IMO. They should be created if people have the relevant interest. --Michael Mol 14:22, 2 April 2011 (UTC)

The description does need enhancement and citation per Dkf

A reference to RSA for a start

A caution that the modulus is a demonstration size only and reference to something on key sizes. Real keys are much longer and according to NIST even 1024 bit keys are on their way out. I believe the correct NIST document is referenced here Key sizes. Possibly also the RSA challenges RSA Key Factoring Challenge Archive

A caution on cryptography like appears in the MD5 or MD5 implementation task (the later if I recall)

A note that the blocking is arbitrary and that real implementations just encrypt the the underlying binary

--Dgamey 10:44, 21 April 2011 (UTC)

If blocking is arbitrary, perhaps blocking should be removed from the task description? The Python implementations currently use a block size of 1 character, and the blocking code adds some significant complexity, so why not just get rid of it? --Rdm 11:45, 22 April 2011 (UTC)

I've been thinking about this and I see two tasks here. Let me explain. (I added the subtitle above)

RSA should be about encryption/decryption. The problem is the demonstration numbers are too small for multiple ascii characters. Finding one or two sets of larger numbers would allow for this and remove the need for the blocking and some of the resulting constraints. RSA has no problem with zeros. Nor does it char about ascii, ebcdic, double-byte, etc. RSA also requires bignum support so having some numbers that exercise that would make sense. Since some languages can't do this having several sets of key pairs of regular and bignum size would allow everyone to play. Calling out the limitation should be enough. I was poking around looking for some, thinking of the smaller RSA challenges (140 bit) but haven't yet found the whole set of numbers. I think running a test message through a couple of different RSA key pairs would make a fine task. Perhaps generating some small keys would be a good sister task.

Some links.

java key gen demo

js key gen demo

prime number source - I've requested a few over 5 billion and will post some

Character blocking/unblocking could be a whole other task. In fact we could clone the page and mark everything needs improvement (get the rsa out). This task needs more specification. The example text would encode ha as 0801 which is a character set size of 100 not the 31 or 32 shown. I think some of the code examples used base 30ish. This kind of task is common but unrelated to encryption. Older operating systems used to encode characters in 6 rather than 8 bits yielding 4 characters per 3 bytes, printable mappings as well.

I also agree the huge UI thing didn't belong here. Possible another task.

--Dgamey 14:15, 22 April 2011 (UTC)

RSA129 as an example

It seems the RSA 129 challenge performed a character encoding as well ... hmmm

RSA 129 factors

The RSA 129 text "The Magic Words are Squeamish Ossifrage"

[[http://www.math.okstate.edu/~wrightd/numthry/rsa129.html summaries of rsa 129 status reports and full details of keys (see below)

Modular exponentiation will challenge any naive implementation of RSA. Your bignums may blow up decrypting the message above.

-Dgamey 15:56, 22 April 2011 (UTC) update: --Dgamey 18:08, 24 April 2011 (UTC)

RSA 129 - Final Answer (April 27, 1994)

We are happy to announce that

RSA-129 = 11438162575788886766923577997614661201021 82967212423625625618429357069352457338978 30597123563958705058989075147599290026879543541

= 3490529510847650949147849619903898133417764638493387843990820577 * 32769132993266709549961988190834461413177642967992942539798288533

The encoded message published was

968696137546220614771409222543558829057599911245743198746951209 30816298225145708356931476622883989628013391990551829945157815154

This number came from an RSA encryption of the `secret' message using the public exponent 9007. When decrypted with he `secret' exponent

106698614368578024442868771328920154780709906633937862801226224 496631063125911774470873340168597462306553968544513277109053606095

this becomes

20080500130107090300231518041900011805 0019172105011309190800151919090618010705

Using the decoding scheme 01=A, 02=B, ..., 26=Z, and 00 a space between words, the decoded message reads

THE MAGIC WORDS ARE SQUEAMISH OSSIFRAGE

To find the factorization of RSA-129, we used the double large prime variation of the multiple polynomial quadratic sieve factoring method. The sieving step took approximately 5000 mips years, and was carried out in 8 months by about 600 volunteers from more than 20 countries, on all continents except Antarctica. Combining the partial relations produced a sparse matrix of 569466 rows and 524338 columns. This matrix was reduced to a dense matrix of 188614 rows and 188160 columns using structured Gaussian elimination. Ordinary Gaussian elimination on this matrix, consisting of 35489610240 bits (4.13 gigabyte), took 45 hours on a 16K MasPar MP-1 massively parallel computer. The first three dependencies all turned out to be `unlucky' and produced the trivial factor RSA-129. The fourth dependency produced the above factorization.

We would like to thank everyone who contributed their time and effort to this project. Without your help this would not have been possible.

Derek Atkins

Michael Graff

Arjen Lenstra

Paul Leyland

--Dgamey 15:39, 24 April 2011 (UTC)

Blocking?

"This yields two blocks of numbers ..." I can see that a series of numbers are produced, but the method of splitting into blocks is not given. --Paddy3118 02:45, 26 March 2011 (UTC)

The blocking code is more complicated than the encryption code. But:

When converting letters to numbers, the numbers should be non-zero.
If if X is the largest number value, then pick the largest K such that (1+X)^K < N (where N is from the key). K is the number of letters represented in a block.
If v is an array of numbers repesenting the letters in a block, the numeric value of the block itself is can be computed (using psuedo-C or maybe psuedo-javascript):

   block= 0;
   for (int i= 0; i<v.length; i++) {
       block= v[i]+(X+1)*block;
   }

To go the other direction:

   for (int i= v.length-1; i >= 0; i--) {
       v[i]= block % (X+1);  /* remainder function corresponding to division on next line */
       block= block / (X+1); /* integer division */
   }

--Rdm 03:31, 26 March 2011 (UTC)

RSA in J

I may simply be naive, but it seems to me that the example doesn't code "hi there" correctly. The output is given as "695 153 2377 260". I tried to decrypt those blocks, but i can only recover gibberish from them. Even abandoning my python code and simply doing the modular exponentiation yields the same results. I get "575, 1230, 2387, 205" if i split "hi there" into 4 blocks. Can you explain how exactly you are encoding and decoding? I am not familiar with J, so i cant really tell whats going on in your code. --Erasmus 03:04, 30 March 2011 (UTC)

I believe the pseudo code I posted, above, matches the process I used in J, though I cannot actually run the pseudo code, and it might be buggy. But I have updated the J entry with a blocking example. So, for example: 'h' is letter 8 and 'i' is letter 9 and 'hi' is thus (32*8)+9 or 265. But perhaps we should be using a different blocking algorithm? --Rdm 12:26, 30 March 2011 (UTC)

I should perhaps also note that I cannot get the python example to work. It fails for me with the message ImportError: No module named tkinter. But these work for me:

<lang python>>>> import _tkinter

>>> import Tkinter >>> Tkinter._test()</lang>

And the python implementation does not supply any canned examples.

So, anyways, I have not tested against the python code. --Rdm 17:35, 30 March 2011 (UTC)

Ah...that explains it. Our blocking methods were just entirely different. I recovered 265, but i didnt know how to turn that back to letters. My code blocks "hi" as "0809", since "h" is "08" and "i" is "09". If "f" is the number of digits in "n", my python code only places (f/2)-1 letters per block. I realize this doesn't make sense for a small n, but at larger n it isn't noticeable, and it prevents my blocks from being larger than "n".

I have uploaded a version of my code that doesn't use the Tkinter window. I find running it from python IDLE seems to work best, since i can copy paste plaintext or ciphertext to and from the window. --Erasmus 01:18, 1 April 2011 (UTC)

Ok, I think I understand your approach now. The reason I went the way I did was because I felt "letters" was somewhat arbitrary. I could, in principle, use arbitrary ascii or even unicode. (If I used utf8 instead of wide characters, some characters would get split across multiple blocks, but except for the garbage padding at the end, that should not be a problem. I would need to do something special to deal with the errors that could be generated by using arbitrary codes in the padding at the end, but I have not thought that far ahead.) That said, I see that the task description has changed, so I will need to update the J implementation. --Rdm 11:55, 1 April 2011 (UTC)

Actually, I have a problem with this blocking mechanism. N is 2537 while ' t' encodes as 3120, which means that the RSA algorithm decodes it as 583. It's possible to work around this for cases where the decoding produces a non-legal character, but I believe that that kind of special case code makes this not be a general case implementation of RSA. However, I am still not able to test my example phrase ('hi there') against your code -- this time I had to do a fresh python install (the machine I was using did not have it installed) and after installing it and running your code, I get an error when I try to use the new program:

<lang python>Traceback (most recent call last):

 File "rsa.py", line 154, in <module>
   if mm.lower() == 'encrypt':

AttributeError: 'function' object has no attribute 'lower'</lang> --Rdm 12:34, 1 April 2011 (UTC)

Ok, so I editted all instances of .lower out of the python code. However, I should note that you are apparently only using one letter per block. That means that the theoretical problem I mentioned is not an issue for you. But that also means that your python implementation is just doing a simple letter substitution code. --Rdm 12:53, 1 April 2011 (UTC)

The .lower() may be something added in python 3. I forgot to mention that being new to coding, nobody told me that python 2 was still more widely used, so this is all done in the newest version of python. The .lower() function just changes a string to all lower case letters, so it took out the case sensitivity when my code tries to pair the letter with a number.

For larger n, the blocks grow. Since this n is only 4 digits long, each block is one letter. If n were 20 or 21 digits, each block would contain 9 characters. That was my way of making sure the blocks were never larger than n, since each character for me converts to 2 digits. --Erasmus 00:03, 2 April 2011 (UTC)

~

Ok, but as specified, the task is to substitute the sequence of letters 'abcdefghijklmnopqrstuvwxyz,.!? ' and replace each of them with the corresponding number from this list:

1 581 1087 140 205 2371 1125 156 1864 2403 821 2497 2038 1616 2116 1841 959 2222 2299 793 41 45 2475 2130 1433 1836 1642 206 577 1488 384

. Decryption is looking up those numbers and substituting for the corresponding letters. The padding is irrelevant noise. And, as specified, you do not have to implement RSA at all to accomplish this task. --Rdm 00:58, 2 April 2011 (UTC)

@@ Line 44: / Line 44: @@
 : [http://www.willamette.edu/~mjaneba/rsa129.html RSA 129 factors]
-: [[wp:The_Magic_Words_are_Squeamish_Ossifrage| The RSA 129 text]]
+: [[wp:The_Magic_Words_are_Squeamish_Ossifrage|The RSA 129 text "The Magic Words are Squeamish Ossifrage"]]
 : [[http://www.math.okstate.edu/~wrightd/numthry/rsa129.html summaries of rsa 129 status reports and full details of keys (see below)
+: [[wp:Modular_exponentiation|Modular exponentiation]] will challenge any naive implementation of RSA.  Your bignums may blow up decrypting the message above.
--[[User:Dgamey|Dgamey]] 15:56, 22 April 2011 (UTC) update: --[[User:Dgamey|Dgamey]] 15:39, 24 April 2011 (UTC)
+-[[User:Dgamey|Dgamey]] 15:56, 22 April 2011 (UTC) update: --[[User:Dgamey|Dgamey]] 18:08, 24 April 2011 (UTC)
 === RSA 129 - Final Answer (April 27, 1994) ===