Commit Graph

1415 Commits

Author SHA1 Message Date
aszlig
7b5263e1a6
tesseract: Package version 4.x from Git master
Tesseract 4 has a new OCR engine based on long short-term memory (LSTM)
neural networks, which considerably improves accuracy, including in our
VM tests.

I ran the new version across a bunch of different screenshots and
compared the results to the 3.x branch. It makes a big difference,
especially with various font rendering settings.

The only downside is that version 4 hasn't been released yet and is
currently in alpha state, but it will eventually get there, and the only
alternatives I could think of that stick to version 3 were really
sub-par:

 * Use several passes with different color negation on the screenshots.
 * Train Tesseract 3 specifically for screenshots. This is sub-par
   because we'd need to do it for Tesseract 4 from scratch again.
 * Change the test systems so that they use *only* an OCR font when
   displaying text. I've actually tried this, but it still isn't
   accurate enough with our default font rendering setup.
 * Turn off special font rendering settings for our tests. In
   conjunction with changing to an OCR font this might work but it won't
   catch all the cases, because applications might use their own font
   rendering.

Given that version 4 is faster[1] when it comes to OCR detection, and
given the points just mentioned, I think using the alpha version just
for tests isn't going to hurt anybody.

[1]: https://github.com/tesseract-ocr/tesseract/wiki/4.0-Accuracy-and-Performance

Signed-off-by: aszlig <aszlig@redmoonstudios.org>
2017-04-11 03:21:46 +02:00
aszlig
c381fa9b63
tesseract: 3.04.01 -> 3.05.00
Upstream changelog:

 * Made some fine tuning to the hOCR output.
 * Added TSV as another optional output format.
 * Fixed ABI break introduced in 3.04.00 with the AnalyseLayout()
   method.
 * text2image tool - Enable all OpenType ligatures available in a font.
   This feature requires Pango 1.38 or newer.
 * Training tools - Replaced asserts with tprintf() and exit(1).
 * Fixed Cygwin compatibility.
 * Improved multipage tiff processing.
 * Improved the embedded pdf font (pdf.ttf).
 * Enable selection of OCR engine mode from command line.
 * Changed tesseract command line parameter '-psm' to '--psm'.
 * Added new C API for orientation and script detection, removed the old
   one.
 * Increased minimum autoconf version to 2.59.
 * Removed dead code.
 * Fixed many compiler warnings.
 * Fixed memory and resource leaks.
 * Fixed some issues with the 'Cube' OCR engine.
 * Fixed some OpenCL issues.
 * Added option to build Tesseract with CMake build system.
 * Implemented CPPAN support for easy Windows building.

The upstream URL of the change log is:

https://github.com/tesseract-ocr/tesseract/releases/tag/3.05.00

Tested by building against the following packages that directly depend
on it:

 * vapoursynth (with ocrSupport = true)
 * pyocr (fails)
 * vobsub2srt

Also tested against the following NixOS VM tests that have OCR enabled:

 * nixos/tests/chromium.nix -A stable
 * nixos/tests/emacs-daemon.nix
 * nixos/tests/installer.nix -A luksroot
 * nixos/tests/lightdm.nix
 * nixos/tests/plasma5.nix
 * nixos/tests/sddm.nix

All of the packages and tests except pyocr build/succeed on
x86_64-linux.

Fixing pyocr is outside of the scope of this commit and will happen very
soon.

Signed-off-by: aszlig <aszlig@redmoonstudios.org>
2017-04-11 03:21:32 +02:00
aszlig
288a79187c
tesseract: Reintroduce enableLanguages
I removed that attribute in 68bc260ca2 because the language files were
no longer distributed as separate files. However, with all languages
included, the closure size of Tesseract gets quite large (around
1.2 GB), even if we only want to use the English training data, which is
a bit much just to be able to run NixOS VM tests.

For this reason I've also switched the VM tests back to using only the
English language.

Tested using the following VM tests (the ones that have OCR enabled) on
x86_64-linux:

 * nixos/tests/chromium.nix -A stable
 * nixos/tests/emacs-daemon.nix
 * nixos/tests/installer.nix -A luksroot
 * nixos/tests/lightdm.nix
 * nixos/tests/plasma5.nix
 * nixos/tests/sddm.nix

Signed-off-by: aszlig <aszlig@redmoonstudios.org>
2017-04-11 03:21:26 +02:00
Lancelot SIX
045ecd11f8 Merge pull request #24785 from paperdigits/darktable-2.2.4
darktable: 2.2.3 -> 2.2.4
2017-04-10 13:18:11 +02:00
Daiderd Jordan
f8230518a2 Merge pull request #24762 from matthewbauer/darwin-misc-fixes
darwin: miscellaneous fixes
2017-04-10 08:50:15 +02:00
Mica Semrick
eae15ab771 darktable: 2.2.3 -> 2.2.4 2017-04-09 15:58:06 -07:00
Franz Pletz
e798712da7 Merge pull request #24759 from matthewbauer/inkscape-darwin-fix
inkscape: fix missing library error
2017-04-09 10:56:20 +02:00
Matthew Bauer
c344f46321
djview: fix macOS build 2017-04-08 23:28:00 -05:00
Matthew Bauer
ba78c50069
inkscape: fix missing library error 2017-04-08 23:20:10 -05:00
Laverne Schrock
b70b1b1f06 shotwell: 0.25.90 -> 0.26.0
Simple version bump.
2017-04-06 15:30:03 -05:00
Lprndn
75319eb203
nomacs: 3.4 -> 3.6.1
fixes #24589
2017-04-03 21:21:05 +02:00
Taahir Ahmed
438ac662aa nomacs: init at 3.4 (#24580)
* nomacs: init at 3.4

* nomacs: add gsettings for gtk open dialogs

* nomacs: use fetchurl instead of fetchFromGitHub
2017-04-03 10:28:34 +02:00
Robin Gloster
62303628ce
vimiv: mark as broken
cc @aszlig
2017-03-30 16:23:35 +02:00
Vladimír Čunát
96d41e393d
treewide: purge maintainers.urkud
It's sad, but he's been inactive for the last five years.
Keeping such people in meta.maintainers is counter-productive.
2017-03-27 19:52:29 +02:00
Thomas Tuegel
8b50f4c990 Merge pull request #24299 from ttuegel/master--drop-qt-5.7
Drop Qt 5.5 and Qt 5.7 from master
2017-03-26 09:18:38 -05:00
Michael Raskin
7b706900e7 graphicsmagick: patch for CVE-2017-6335 2017-03-25 21:04:08 +01:00
Thomas Tuegel
e6dc95697a
rapcad: pin to Qt 5.6 2017-03-25 09:23:52 -05:00
Thomas Tuegel
5044ceb7e7
rapcad: broken on Qt 5.6 2017-03-25 08:49:39 -05:00
ndowens
6b9471f32b feh: Remove un-needed libPath 2017-03-21 16:27:55 -05:00
jansol
f9e688e8a1 renderdoc: init at version 0.34pre (#23769)
* renderdoc: init at version 0.34pre

Initialising a few commits after the latest release due to some upstream
improvements to the build system.

* fix maintainer
2017-03-21 21:36:26 +01:00
Pascal Wittmann
a20fa00de7 fbida: add dependency to lirc 2017-03-21 13:39:52 +01:00
Pascal Wittmann
8aacf212ed fbida: 2.12 -> 2.13 2017-03-21 13:36:35 +01:00
Pascal Wittmann
8d721e2910 Merge pull request #24118 from mimadrid/update/yed-3.17
yed: 3.16.2.1 -> 3.17
2017-03-20 13:06:05 +01:00
mimadrid
596e10c236
yed: 3.16.2.1 -> 3.17 2017-03-20 11:40:32 +01:00
2chilled
a3b3a7ffb1 rawtherapee: 5.0-r1 -> 5.0-r1 with gtk3 support (#22911) 2017-03-20 11:39:53 +01:00
Jörg Thalheim
f66e06a828
leocad: remove unnecessary patches 2017-03-19 19:18:32 +01:00
Joachim F
575cf2e17f Merge pull request #24035 from ndowens/leocad
leocad: 0.81 -> 17.02
2017-03-19 18:22:59 +01:00
ndowens
26187da45c Merge pull request #24038 from ndowens/openimageio
openimageio: 1.6.11 -> 1.7.12
2017-03-19 12:12:40 -05:00
ndowens
20e62c3b76 Merge pull request #24030 from ndowens/fontmatrix
fontmatrix: Changed URL & homepage; they no longer exist
2017-03-19 10:03:34 -05:00
ndowens
cb231786af Merge pull request #24033 from ndowens/glabels
glabels: 3.2.1 -> 3.4.0
2017-03-19 10:03:06 -05:00
Frederik Rietdijk
a4e7e0d3c1 Merge pull request #24025 from ndowens/ahoviewer
ahoviewer: 1.4.6 -> 1.4.8
2017-03-19 15:06:44 +01:00
Frederik Rietdijk
b1f7157675 Merge pull request #24024 from ndowens/alchemy
alchemy: 007 -> 008
2017-03-19 15:06:13 +01:00
Frederik Rietdijk
6405cb2ea9 Merge pull request #24034 from ndowens/jpegoptim
jpegoptim: 1.4.3 -> 1.4.4
2017-03-19 15:04:18 +01:00
ndowens
64a880faa6 djview: 4.10.5 -> 4.10.6 (#24029) 2017-03-19 14:40:15 +01:00
ndowens
705b2d9b66 feh: 2.18.1 -> 2.18.2 2017-03-19 10:45:52 +01:00
Michael Raskin
d860d9aedf Merge pull request #24043 from ndowens/potrace
potrace: 1.13 -> 1.14
2017-03-19 09:59:18 +01:00
Benjamin Staffin
85af430be3 Merge pull request #24046 from ndowens/gthumb
gthumb: 3.4.4 -> 3.5.1
2017-03-19 03:47:10 -04:00
Benjamin Staffin
3effae81f1 Merge pull request #24044 from ndowens/pqiv
pqiv: 0.12 -> 2.8.3
2017-03-19 03:44:24 -04:00
Benjamin Staffin
3b762fb201 Merge pull request #24049 from ndowens/rapcad
rapcad: 0.9.5 -> 0.9.8
2017-03-19 03:42:54 -04:00
ndowens
ec85bdb6c6 rapcad: 0.9.5 -> 0.9.8 2017-03-18 22:39:18 -05:00
ndowens
8a8b80d289 gthumb: 3.4.4 -> 3.5.1 2017-03-18 21:52:04 -05:00
ndowens
237ac13370 pqiv: 0.12 -> 2.8.3 2017-03-18 21:20:23 -05:00
ndowens
56504fcb2c potrace: 1.13 -> 1.14 2017-03-18 21:11:39 -05:00
neeasade
78a0bdfa98 meh: init at unstable-2015-04-11 2017-03-18 21:11:22 -05:00
ndowens
f5d6dd6e83 openimageio: 1.6.11 -> 1.7.12 2017-03-18 20:22:54 -05:00
ndowens
7b1e1f3cd7 leocad: 0.81 -> 17.02 2017-03-18 19:32:30 -05:00
ndowens
7364b6c252 jpegoptim: 1.4.3 -> 1.4.4 2017-03-18 19:09:17 -05:00
ndowens
3dfd03b382 glabels: 3.2.1 -> 3.4.0 2017-03-18 18:38:47 -05:00
ndowens
4024c6354e fontmatrix: Changed URL & homepage; they no longer exist 2017-03-18 18:30:57 -05:00
ndowens
35e7df6bee ahoviewer: 1.4.6 -> 1.4.8 2017-03-18 16:42:47 -05:00