Commit Graph

105381 Commits

Author SHA1 Message Date
Daniel Brockman
881595ac03 ethrun: init at 0.1.0 2017-04-11 06:33:16 +02:00
aszlig
5d5c0d590f
Revert "sddm: Fix test."
This reverts commit 0a6a06346a.

The commit replaced the text to search for from ALICE to BOB, because
our OCR detection only caught "BOB FOOBAR" but missed "ALICE FOOBAR"
completely.

With the improvements to our OCR system this no longer is the case and
the test passes successfully with this reverted.

Signed-off-by: aszlig <aszlig@redmoonstudios.org>
Cc: @shlevy
2017-04-11 03:21:58 +02:00
aszlig
a443bdc0a6
nixos/testing: Improve quality of OCR
First of all, we're now using ImageMagick to improve the screenshot so
that Tesseract has an easier time recognizing the text. The resulting
image of this post-processing is a scaled-up black-and-white version
with the backgrounds almost entirely removed and the text edges a bit
blurred, so the screenshots now more or less resemble an image from a
scanner. This is what Tesseract is trained for by default.

As mentioned in the previous commit we now also use Tesseract 4, which
further improves the quality of text recognition.

I've spent countless hours testing different post-processing variants
to find out what works best for our tests, and this is the one that
worked best so far. It's certainly not perfect and I'd like to avoid
the scaling step, but we're way better off than before.

In addition to this, the OCR process is now done without an intermediate
file, solely using pipes.
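
As a rough illustration of the pipe-only approach described above, here is
a minimal Python sketch. It is not the code from this commit: the function
names are hypothetical, and the ImageMagick options and their values
(grayscale, 400% scaling, 50% threshold, slight blur) are assumptions
chosen only to mirror the description, not the settings actually used.

```python
import subprocess


def build_ocr_commands(screenshot_path, scale_percent=400):
    """Return (convert_argv, tesseract_argv) for a two-stage OCR pipeline.

    All option values here are illustrative assumptions.
    """
    convert_argv = [
        "convert", screenshot_path,
        "-colorspace", "Gray",            # drop color information
        "-resize", f"{scale_percent}%",   # scale the image up
        "-threshold", "50%",              # reduce to black and white
        "-blur", "1x1",                   # soften the text edges a little
        "png:-",                          # write the PNG to standard output
    ]
    # "stdin"/"stdout" tell tesseract to read and write via pipes.
    tesseract_argv = ["tesseract", "stdin", "stdout"]
    return convert_argv, tesseract_argv


def ocr_screenshot(screenshot_path):
    """Run both stages connected by a pipe, without an intermediate file."""
    convert_argv, tesseract_argv = build_ocr_commands(screenshot_path)
    convert = subprocess.Popen(convert_argv, stdout=subprocess.PIPE)
    tesseract = subprocess.Popen(
        tesseract_argv, stdin=convert.stdout, stdout=subprocess.PIPE
    )
    # Close our copy so convert notices if tesseract exits early.
    convert.stdout.close()
    text, _ = tesseract.communicate()
    convert.wait()
    return text.decode("utf-8", errors="replace")
```

Calling ocr_screenshot("screen.png") requires both ImageMagick and
Tesseract to be installed; build_ocr_commands() can be inspected without
either.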

I've tested this using the following VM tests which have OCR enabled:

 * nixos/tests/chromium.nix -A stable
 * nixos/tests/emacs-daemon.nix
 * nixos/tests/installer.nix -A luksroot
 * nixos/tests/lightdm.nix
 * nixos/tests/plasma5.nix
 * nixos/tests/sddm.nix

All of the tests still succeed, and comparing some of the recognition
results to the earlier ones shows that a lot more text is now detected
than before this commit.

Signed-off-by: aszlig <aszlig@redmoonstudios.org>
2017-04-11 03:21:53 +02:00
aszlig
7b5263e1a6
tesseract: Package version 4.x from Git master
Tesseract 4 has a new OCR engine based on long short-term memory (LSTM)
neural networks, which really helps a lot in terms of accuracy in our
VM tests.

I ran the new version across a bunch of different screenshots and
compared the results to the 3.x branch, and it really makes a big
difference, especially with various font rendering settings.

The only downside of this is that version 4 hasn't been released yet and
is in alpha state right now, but it will eventually get there, and the
only solutions that came to my mind for sticking with version 3 were
really sub-par:

 * Use several passes with different color negation on the screenshots.
 * Train Tesseract 3 specifically for screenshots. This is sub-par
   because we'd need to do it for Tesseract 4 from scratch again.
 * Change the test systems so that they specifically use *only* an OCR
   font when displaying text. I've actually tried this but it also
   isn't accurate enough with our default font rendering setup.
 * Turn off special font rendering settings for our tests. In
   conjunction with changing to an OCR font this might work but it won't
   catch all the cases, because applications might use their own font
   rendering.

Given that version 4 is faster[1] when it comes to OCR detection, and
given the points just mentioned, I think using the alpha version just
for tests isn't going to hurt anybody.

[1]: https://github.com/tesseract-ocr/tesseract/wiki/4.0-Accuracy-and-Performance

Signed-off-by: aszlig <aszlig@redmoonstudios.org>
2017-04-11 03:21:46 +02:00
aszlig
49cf934642
pyocr: Add patch to support Tesseract 3.05.00
This is from the commit message I've written for the upstream pull
request (jflesch/pyocr#62):

    This is a bit more involved, because Tesseract 3.05.00 comes not
    only with improvements but also with a few quirks we need to deal
    with.

    The first quirk is that the order of the arguments to the
    `tesseract' command now matters and the list of configurations has
    to be at the end of the command line. So we add a new attribute
    tesseract_flags to the BaseBuilder class that contains a list of
    all the flags to pass to `tesseract'. The tesseract_configs
    attribute remains pretty much the same, but now only contains a
    list of configs instead of being mixed with flag arguments.

    Another quirk has to do with Leptonica >= 1.74, which Tesseract
    3.05.00 now requires. Leptonica has special handling of files that
    reside in /tmp and assumes that they are internal temporary files
    of Leptonica. In order to deal with this, we now run Tesseract in a
    temporary directory which contains the input/output files, and use
    the relative names of these files because Leptonica only searches
    for path names beginning with /tmp.

    Fortunately, the last item we need to address is not really a quirk
    but an API change. In Tesseract 3.05.00 there is now a new function
    called TessBaseAPIDetectOrientationScript(), which doesn't fill the
    OSResults object anymore but instead returns the values we're
    interested in directly by reference. We need to use this new
    function because the old function TessBaseAPIDetectOS() now *always*
    returns false.
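
The first two quirks above can be sketched in Python roughly as follows.
This is not the actual pyocr patch: build_tesseract_argv(),
run_in_private_dir(), the file names input.png/output and the runner
parameter are all hypothetical names introduced for illustration. The
sketch only shows flags being placed before the configs, and Tesseract
being run inside a temporary directory with relative file names.

```python
import os
import subprocess
import tempfile


def build_tesseract_argv(input_name, output_base, flags, configs):
    """Put all flags before the config names, which must come last."""
    return ["tesseract", input_name, output_base, *flags, *configs]


def run_in_private_dir(image_bytes, flags, configs, runner=subprocess.run):
    """Run tesseract with its working directory set to a temporary dir.

    Only relative file names are passed on the command line, so no
    argument starts with /tmp even if the directory itself lives there.
    """
    with tempfile.TemporaryDirectory() as workdir:
        with open(os.path.join(workdir, "input.png"), "wb") as handle:
            handle.write(image_bytes)
        argv = build_tesseract_argv("input.png", "output", flags, configs)
        runner(argv, cwd=workdir, check=True)
        # For plain-text output, tesseract appends ".txt" to the output base.
        result_path = os.path.join(workdir, "output.txt")
        with open(result_path, encoding="utf-8") as handle:
            return handle.read()
```

The runner parameter exists only so the sketch can be exercised without
Tesseract installed; by default it calls subprocess.run.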

I've tested this specifically on NixOS and in conjunction with Paperwork
(the only package that's using pyocr so far) and all the tests of the
dependency chain are now succeeding. However, I didn't do any manual
tests of Paperwork.

Signed-off-by: aszlig <aszlig@redmoonstudios.org>
2017-04-11 03:21:39 +02:00
aszlig
121751e10f
pyocr: 0.4.4 -> 0.4.6
Upstream changes for version 0.4.5:

 * Clean up exceptions raised when OCR fails:
 * Now, all tools raise only exceptions inheriting from
   pyocr.PyocrException
 * There is now one and only one TesseractError (shared between
   pyocr.libtesseract and pyocr.tesseract)
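
A small Python sketch of what this exception cleanup means for callers.
The two classes below are local stand-ins defined only so the example
runs without pyocr installed; they mirror the names mentioned above but
are not the library's actual definitions, and safe_ocr() is a
hypothetical helper.

```python
class PyocrException(Exception):
    """Local stand-in for pyocr.PyocrException (assumption, not pyocr code)."""


class TesseractError(PyocrException):
    """Local stand-in for the single TesseractError shared by both backends."""


def safe_ocr(recognize):
    """Call an OCR function and handle every OCR failure in one place."""
    try:
        return recognize()
    except PyocrException as error:
        # One except clause is enough, since all tool errors share this base.
        return f"OCR failed: {error}"
```

With a common base class, a caller no longer needs a separate except
clause per backend.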

Upstream changes for version 0.4.6:

 * hOCR outputs: Generate valid XHTML files

The full upstream changelog can be found at:

https://github.com/jflesch/pyocr/blob/master/ChangeLog

Note that because of the version bump of Tesseract, neither version
0.4.4 nor version 0.4.6 builds successfully, so we need to fix this up
soon.

Signed-off-by: aszlig <aszlig@redmoonstudios.org>
2017-04-11 03:21:36 +02:00
aszlig
c381fa9b63
tesseract: 3.04.01 -> 3.05.00
Upstream changelog:

 * Made some fine tuning to the hOCR output.
 * Added TSV as another optional output format.
 * Fixed ABI break introduced in 3.04.00 with the AnalyseLayout()
   method.
 * text2image tool - Enable all OpenType ligatures available in a font.
   This feature requires Pango 1.38 or newer.
 * Training tools - Replaced asserts with tprintf() and exit(1).
 * Fixed Cygwin compatibility.
 * Improved multipage tiff processing.
 * Improved the embedded pdf font (pdf.ttf).
 * Enable selection of OCR engine mode from command line.
 * Changed tesseract command line parameter '-psm' to '--psm'.
 * Added new C API for orientation and script detection, removed the old
   one.
 * Increased minimum autoconf version to 2.59.
 * Removed dead code.
 * Fixed many compiler warnings.
 * Fixed memory and resource leaks.
 * Fixed some issues with the 'Cube' OCR engine.
 * Fixed some openCL issues.
 * Added option to build Tesseract with CMake build system.
 * Implemented CPPAN support for easy Windows building.
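
Callers that have to support both sides of the '-psm' to '--psm' rename
listed above could pick the spelling from the version number. A minimal
Python sketch, assuming a purely numeric dotted version string;
parse_version() and psm_flag() are hypothetical helpers, not part of
Tesseract or of any package touched by this commit.

```python
def parse_version(text):
    """Turn a dotted numeric version such as "3.05.00" into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def psm_flag(version):
    """Return the page segmentation flag spelling for a version tuple."""
    # The rename to '--psm' is listed in the 3.05.00 changelog.
    return "--psm" if tuple(version) >= (3, 5, 0) else "-psm"
```

For example, psm_flag(parse_version("3.04.01")) gives "-psm", while
psm_flag(parse_version("3.05.00")) gives "--psm". Version strings with
non-numeric suffixes (such as alpha tags) are not handled by this sketch.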

The upstream URL of the change log is:

https://github.com/tesseract-ocr/tesseract/releases/tag/3.05.00

Tested by building against the following packages that directly depend
on it:

 * vapoursynth (with ocrSupport = true)
 * pyocr (fails)
 * vobsub2srt

Also tested against the following NixOS VM tests that have OCR enabled:

 * nixos/tests/chromium.nix -A stable
 * nixos/tests/emacs-daemon.nix
 * nixos/tests/installer.nix -A luksroot
 * nixos/tests/lightdm.nix
 * nixos/tests/plasma5.nix
 * nixos/tests/sddm.nix

All of the packages and tests except pyocr build/succeed on
x86_64-linux.

Fixing pyocr is outside of the scope of this commit and will happen very
soon.

Signed-off-by: aszlig <aszlig@redmoonstudios.org>
2017-04-11 03:21:32 +02:00
aszlig
42bb63f803
leptonica: 1.72 -> 1.74.1
The changes are a bit too big to include here in the commit message,
so if you want the details of what changed, please visit this URL:

http://leptonica.org/source/version-notes.html

I have also provided openjpeg, giflib and libwebp as dependencies so
that Leptonica is able to read/write those file formats.

Additionally I've added a patch that uses pkgconfig to resolve all
dependencies (except giflib), because unlike AC_CHECK_LIB() the
PKG_CHECK_MODULES() macro defines *_LIBS variables to include the linker
search path.

Unfortunately that patch alone is not enough, because the *_LIBS
variables are substituted by the upstream configure.ac to *not* include
the linker search paths, so we need to remove the AC_SUBST() calls
within PKG_CHECK_MODULES().

The only dependency that's not yet using PKG_CHECK_MODULES() is giflib,
because giflib doesn't have a pkg-config description file, therefore
we're using substituteInPlace to insert the linker search path after the
lept.pc file was generated by configure.

Another thing that we no longer need is the dependency on libpng version
1.2, because Leptonica now also works with more recent libpng versions.

Tested by building the package itself and also the following packages
that immediately depend on leptonica:

 * k2pdfopt
 * tesseract
 * jbig2enc

All of these packages built successfully on x86_64-linux.

The main reason why I'm bumping Leptonica to version 1.74.1 is that we
need at least version 1.74 to bump Tesseract to the latest upstream
version.

Signed-off-by: aszlig <aszlig@redmoonstudios.org>
2017-04-11 03:21:29 +02:00
aszlig
288a79187c
tesseract: Reintroduce enableLanguages
I removed that attribute in 68bc260ca2 because the language files were
no longer distributed as separate files. But even if we only want to
use, for example, the English training data, the closure size of
Tesseract gets quite large (around 1.2 GB), which is a bit much just to
be able to run NixOS VM tests.

For this reason I've also switched the VM tests back to using only the
English language.

Tested using the following VM tests (the ones that have OCR enabled) on
x86_64-linux:

 * nixos/tests/chromium.nix -A stable
 * nixos/tests/emacs-daemon.nix
 * nixos/tests/installer.nix -A luksroot
 * nixos/tests/lightdm.nix
 * nixos/tests/plasma5.nix
 * nixos/tests/sddm.nix

Signed-off-by: aszlig <aszlig@redmoonstudios.org>
2017-04-11 03:21:26 +02:00
Nikolay Amiantov
c8c340b05a tlp service: mask systemd-rfkill
Fixes #24737.
2017-04-11 02:09:29 +03:00
Philipp Steinpass
eec5775a4c steam: move libpciaccess as non-runtime dependencies 2017-04-11 01:51:46 +03:00
Aristid Breitkreuz
6f3b15e3c1 Merge pull request #24798 from avnik/wine-update
wineUnstable: 2.4 -> 2.5
2017-04-10 23:26:50 +02:00
Aristid Breitkreuz
7813c9676a Merge pull request #24802 from asymmetric/awscli
awscli: 1.11.45 -> 1.11.75
2017-04-10 22:55:13 +02:00
John Ericson
2b85b38b1f Merge pull request #24804 from Ericson2314/platform-whitespace
top-level/platforms.nix: Reformat and clean up whitespace
2017-04-10 16:30:43 -04:00
John Ericson
f3055a3c50 top-level/platforms.nix: Reformat and clean up whitespace 2017-04-10 15:39:47 -04:00
Thomas Tuegel
c7dd8a707b
golden-cheetah: fix build
- Use Qt 5.6 to fix compile error.
- Run preInstall and postInstall hooks to fix linking error.
2017-04-10 13:51:45 -05:00
Lorenzo Manacorda
8d18f67a97 awscli: 1.11.45 -> 1.11.75
also update dependency botocore
2017-04-10 18:56:06 +02:00
Shea Levy
1bb8a47803 Add aggregate job for a forthcoming nixpkgs-darwin-unstable channel 2017-04-10 12:35:32 -04:00
Eelco Dolstra
0e0e7c1a8f Merge pull request #24787 from abbradar/gtk3-firefox
GTK3 by default in Firefox and Thunderbird
2017-04-10 18:11:06 +02:00
Alexander V. Nikolaev
3ec56d8da2 wineUnstable: 2.4 -> 2.5 2017-04-10 18:37:28 +03:00
Thomas Tuegel
33194ec649
dropbox: 23.4.17 -> 23.4.18
This update has not been officially announced upstream, but version 23.4.17 no
longer works.
2017-04-10 09:28:01 -05:00
Tuomas Tynkkynen
c0cef0425e btrfs-progs: 4.8.2 -> 4.10.2 2017-04-10 17:09:22 +03:00
Tuomas Tynkkynen
199be99c6f dosfstools: 3.0.28 -> 4.1 2017-04-10 17:09:22 +03:00
Tuomas Tynkkynen
183279002d f2fs-tools: 1.7.0 -> 1.8.0 2017-04-10 17:09:22 +03:00
Tuomas Tynkkynen
c334daa5c2 f2fs-tools: Cleanup a bit 2017-04-10 17:09:22 +03:00
Lorenzo Manacorda
5108c4c7b2 notmuch: fix homepage and notmuch-mutt license (#24777)
* notmuch: fix homepage and notmuch-mutt license

notmuch-mutt's license is GPLv3. might have been changed when it was upstreamed.

* fix scheme

* fix typo in url

* fix field alignment

* use with to make statements shorter
2017-04-10 16:00:25 +02:00
Jörg Thalheim
92ab8b0ee7 Merge pull request #24782 from asymmetric/polybar
polybar: 3.0.4 -> 3.0.5
2017-04-10 15:58:29 +02:00
Franz Pletz
f1f9020224
crowd service: fix secure sso cookies
Crowd didn't detect a secure connection before.
2017-04-10 15:39:37 +02:00
Yann Hodique
a78ce1d4c6 tig: 2.2 -> 2.2.1 (#24770)
* tig: 2.2 -> 2.2.1

Also move to different project URLs, as requested in
https://github.com/jonas/tig/releases/tag/tig-2.2.1

* tig: fix fetching mechanism

Rework the dependencies to allow use of fetchFromGitHub.
2017-04-10 15:03:04 +02:00
Tim Steinbach
0358bf2f92 Merge pull request #24794 from NeQuissimus/sbt_0_13_15
sbt: 0.13.14 -> 0.13.15
2017-04-10 08:45:17 -04:00
Tim Steinbach
5a3dca24d8
sbt: 0.13.14 -> 0.13.15 2017-04-10 08:44:32 -04:00
Tim Steinbach
205abc1fb6
linux: 4.11-rc5 -> 4.11-rc6 2017-04-10 08:34:23 -04:00
Franz Pletz
6049560e60
jenkins: 2.49 -> 2.53 2017-04-10 14:31:27 +02:00
Franz Pletz
4f0dd2f746
prometheus service: add scrapeConfigs.params option 2017-04-10 14:31:27 +02:00
Franz Pletz
b4c7979363
libgit2: 0.24.6 -> 0.25.1 2017-04-10 14:31:26 +02:00
Jörg Thalheim
cbe0062325
wireguard: 0.0.20170324 -> 0.0.20170409 2017-04-10 14:27:21 +02:00
Tobias Geerinckx-Rice
f5fe20c7f4 Merge pull request #24773 from cko/maven-3_5_0
maven: 3.3.9 -> 3.5.0
2017-04-10 13:02:57 +01:00
Lancelot SIX
045ecd11f8 Merge pull request #24785 from paperdigits/darktable-2.2.4
darktable: 2.2.3 -> 2.2.4
2017-04-10 13:18:11 +02:00
Jörg Thalheim
fa4eff9b52 Merge pull request #24360 from clefru/gce-image-shrink-on-master
Shrink GCE bootstrap image to minimum size, and auto-expand it to actual size on first boot.
2017-04-10 12:01:53 +02:00
Frederik Rietdijk
90aaa7319e Merge pull request #24781 from ndowens/texstudio
texstudio: 2.11.2 > 2.12.4
2017-04-10 10:59:42 +02:00
Nikolay Amiantov
f68de22683 wrapGAppsHook: add librsvg as a dependency
User themes may use SVG icons which won't work if the app can't access this
library. This is quite sure to happen (e.g. Adwaita's icons are vector).
2017-04-10 11:37:54 +03:00
Nikolay Amiantov
ef1e28f5f6 qt56.qtwebengine: patch more library paths
Backport 040b86a96e.
2017-04-10 11:35:00 +03:00
Michael Raskin
f12bd6e9b6 lispPackage.iolib: missed one system 2017-04-10 10:09:22 +02:00
Michael Raskin
08abe4fe93 lispPackage.iolib: list the hidden systems to make sure bundles exist 2017-04-10 09:57:17 +02:00
Daiderd Jordan
f8230518a2 Merge pull request #24762 from matthewbauer/darwin-misc-fixes
darwin: miscellaneous fixes
2017-04-10 08:50:15 +02:00
Peter Hoeg
3766b87737 Merge pull request #24789 from sigma/pr/keychain-maintainer
keychain: add sigma as maintainer
2017-04-10 14:41:29 +08:00
Yann Hodique
f4ff67099b keychain: add sigma as maintainer 2017-04-09 22:28:27 -07:00
Joachim Fasting
7701cbca6b
grsecurity: 4.9.20-201703310823 -> 4.9.21-201704091948 2017-04-10 03:34:42 +02:00
Nikolay Amiantov
e2fe47008f thunderbird: enable GTK3 by default 2017-04-10 02:43:26 +03:00
Nikolay Amiantov
999cf98de9 firefox: enable GTK3 by default 2017-04-10 02:43:18 +03:00