Difference between revisions of "User:Maximilian Janisch/latexlist"
Latest revision as of 03:12, 27 June 2023
Project started in April 2019.
Quoting from EoM talk: "This project is based on an electronic version of the 'Encyclopaedia of Mathematics', published by Kluwer Academic Publishers until 2003, and by Springer after that. The encyclopaedia goes back to the Soviet Matematicheskaya entsiklopediya (1977), originally edited by Ivan Matveevich Vinogradov. The electronic version had its formulae written in TEX, which were saved as png images. On its way through the various publishers the original TEX source code was lost, therefore, to edit a formula in one of these original pages requires to retype the code for that formula from scratch.
For the project, it will be of big help to transcribe the old pages. To make this easy, it was decided to use MathJax, which allows to use Plain TEX or LATEX for formulae encoding. "
Currently, there are about 270'000 images of formulas whose LaTeX source code has been lost. Many of these images are duplicates (see User:Maximilian Janisch/latexlist/duplicates), which makes classification easier. There are services such as Mathpix that automatically convert the images back to TeX code. However, these translations are not infallible (see User:Maximilian Janisch/latexlist/latex).
So far, I have classified all 270'125 images of this Encyclopedia into 103'285 classes of duplicates (some images appear hundreds of times, others just once). My goal is to translate the 103'285 distinct images back to TeX code with the help of Mathpix, and then replace the images with the TeX code in the corresponding articles. Some of the images are not formulas but graphics of other types, so I won't re-encode those.
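As an illustration only, grouping images into classes of duplicates could be sketched as follows. This is not necessarily how the classification for this project was done: the sketch assumes that duplicates are byte-identical PNG files, and the function names and the directory name "images" are hypothetical.

```python
import hashlib
from collections import defaultdict
from pathlib import Path
from typing import Iterable


def group_by_digest(images: Iterable[tuple[str, bytes]]) -> dict[str, list[str]]:
    """Group image names by the SHA-256 digest of their raw bytes.

    Assumption: two images count as duplicates only if their bytes are
    identical. Visually equal images with different encodings are NOT merged.
    """
    classes: dict[str, list[str]] = defaultdict(list)
    for name, data in images:
        classes[hashlib.sha256(data).hexdigest()].append(name)
    return dict(classes)


def load_pngs(image_dir: str) -> Iterable[tuple[str, bytes]]:
    """Yield (file name, file contents) for every PNG in a directory (hypothetical layout)."""
    for path in sorted(Path(image_dir).glob("*.png")):
        yield path.name, path.read_bytes()
```

With such a grouping, e.g. `classes = group_by_digest(load_pngs("images"))`, only one image per class would have to be translated, and the result could then be reused for every other member of that class.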
Added March 2020: Ulf Rehmann found the original Nroff codes for about four fifths of the non-texified articles. They can be translated to LaTeX automatically with very few to no errors. So we are left with only about 60'000 formulas or 23'000 classes of images to translate back "manually".
Added July 2020: The correction of the translations of the 23'000 classes was finished. Consequently, 630 articles were retexified (see below). Additionally, 3400 articles were retexified by Ulf using the Nroff sources.
Added June 2021: After the translation, there remained 36 articles that still had to be texified. Thanks to further manual texifications by User:Redactedentity, there are currently only 9 articles left that need to be texified.
Texified articles
See User:Maximilian Janisch/latexlist/texified.
Classified images
See User:Maximilian Janisch/latexlist/latex
Classified Duplicates
See User:Maximilian Janisch/latexlist/duplicates
Algebraic Groups
See User:Maximilian Janisch/latexlist/Algebraic Groups
Maximilian Janisch/latexlist. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Maximilian_Janisch/latexlist&oldid=43803