4 comments

  • jmagland 1 hour ago
    I've found that Asymmetric Numeral Systems (you mentioned it briefly) is the optimal practical method for pure entropy encoding. I just posted this https://news.ycombinator.com/item?id=47806122
  • georgemcbay 1 hour ago
    For anyone who already has at least a surface-level understanding of compression and wants to take a deeper dive, check out Charles Bloom's blog:

    http://cbloomrants.blogspot.com

    Unfortunately, it has been dormant for some time, but there are years' worth of useful information there, and he is an uncommonly good presenter of technical knowledge through the written word.

  • jgalt212 2 hours ago
    My compression algo explorations are like font explorations. I spend a lot of time doing research and testing, but I (almost) always end up coming back to gzip / arial.

    One notable exception is very large files (e.g. 10 GB+ mbox archives): we found that 7z compressed them to 39% of the original size versus 65% for gzip, and 7z was about 10% faster as well.
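
    For anyone who wants to check this kind of size comparison on their own data, here is a minimal sketch using only Python's standard library. It assumes `lzma` as a stand-in for 7z (LZMA is 7z's default method, but the container and tuning differ), and the function name `compression_ratios` and the sample data are made up for illustration. It measures size only, not speed.

```python
import gzip
import lzma


def compression_ratios(data: bytes) -> dict[str, float]:
    """Return compressed size as a fraction of the original size, per codec."""
    if not data:
        raise ValueError("need non-empty input")
    return {
        # gzip at its highest compression level
        "gzip": len(gzip.compress(data, compresslevel=9)) / len(data),
        # xz/LZMA at the default preset; an assumption standing in for 7z
        "lzma": len(lzma.compress(data, preset=6)) / len(data),
    }


if __name__ == "__main__":
    # Synthetic, highly repetitive sample; real mbox archives will behave differently.
    sample = b"From: someone@example.com\nSubject: hello\n\nbody text\n" * 20000
    for name, ratio in compression_ratios(sample).items():
        print(f"{name}: {ratio:.1%} of original size")
```

    Swapping the sample for a slice of a real archive gives a more meaningful number; results on synthetic input say little about mail data.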

    • web007 28 minutes ago
      zstd beats gzip on both speed and size, for every compression level.

      If you need compatibility, then gzip (pigz), zip (7z), or bz2 (pbzip2) are the best of the worse options, but for Pareto-optimal speed and size you want zstd.

  • gmiller123456 1 hour ago
    [Deleted]
    • ghusbands 1 hour ago
      It has DEFLATE code, Snappy code, LZ4 code, ZSTD exploration, and describes many involved sub-algorithms, with diagrams - what more were you wanting?