
I took the same list provided by this post and added a few more extensions to the search. With that I was able to download 2327 of 2542 NATIVE files. For each URL I made a HEAD request before trying to download it with a GET request. This method turned up 3 additional files that returned Content-Type and Content-Length in the HEAD response but ultimately “disappeared” and returned a 404 on the GET request.
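For anyone who wants to reproduce the check, here is a minimal sketch of the HEAD-then-GET approach described above. It is not the script used for this search; the function name `head_then_get`, the status labels, and the injectable `opener` parameter are all assumptions made for illustration.

```python
import urllib.error
import urllib.request


def head_then_get(url, opener=urllib.request.urlopen, timeout=30):
    """Probe a URL with HEAD, then download it with GET.

    Returns (status, advertised_length, body) where status is one of:
      "missing"  - the HEAD request itself failed with an HTTP error
      "vanished" - HEAD succeeded but the GET returned 404
      "ok"       - both requests succeeded
    """
    head = urllib.request.Request(url, method="HEAD")
    try:
        with opener(head, timeout=timeout) as resp:
            # Content-Length as advertised by the server (a string, or None)
            advertised = resp.headers.get("Content-Length")
    except urllib.error.HTTPError:
        # Any HTTP error on HEAD is treated as "no file with this extension"
        return "missing", None, None

    try:
        with opener(urllib.request.Request(url), timeout=timeout) as resp:
            return "ok", advertised, resp.read()
    except urllib.error.HTTPError as err:
        if err.code == 404:
            # HEAD advertised a file, but the download is gone
            return "vanished", advertised, None
        raise
```

Logging the "vanished" results separately keeps a record of files that the server advertised but would not serve.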
NOTE:
- All MS Office files (.doc(x), .xls(x), .ppt(x)) are exactly ZERO bytes long.
- There are two SQLite .db files that are password protected; I have not yet tried to crack them.
- Lots of jail footage.
- I think the very small .avi videos with many sequential Bates numbers are actually single frames that need to be recombined into the original video. I have not done so.
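As a possible first step toward testing the single-frame idea, the sketch below groups .avi filenames into runs of consecutive numbers. It assumes the filenames are an optional letter prefix followed by the Bates number and `.avi`; that naming pattern, the regex, and the function name `sequential_runs` are assumptions, and the code does not attempt the actual recombination.

```python
import re
from itertools import groupby

# Assumed filename shape: optional letter prefix, digits, ".avi"
BATES = re.compile(r"^([A-Za-z_-]*)(\d+)\.avi$", re.IGNORECASE)


def sequential_runs(filenames, min_len=2):
    """Return lists of filenames whose numbers are consecutive."""
    parsed = []
    for name in filenames:
        match = BATES.match(name)
        if match:
            parsed.append((match.group(1), int(match.group(2)), name))
    parsed.sort()

    runs = []
    # Inside a consecutive run, (number - position) stays constant,
    # so it can serve as the grouping key together with the prefix.
    keyed = groupby(enumerate(parsed), key=lambda p: (p[1][0], p[1][1] - p[0]))
    for _, group in keyed:
        names = [item[2] for _, item in group]
        if len(names) >= min_len:
            runs.append(names)
    return runs
```

Each returned run is a candidate set of frames; whether they really belong to one video would still need to be checked by inspecting the files themselves.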
Extensions I tried:
dataset10:
avi, mp4, mov, mp3, wav, m4a, m4v, wmv, ts, vob, 3gp, amr, opus, csv, xlsx, xls, docx, doc, pluginpayloadattachment
common-audio:
m4a, mp3, wav, aac, flac, ogg, wma, aiff, opus, m4b
common-video:
mp4, mov, avi, wmv, mkv, webm, m4v, mpg, mpeg, 3gp
uncommon-audio:
ac3, amr, mka, au, ra, mid, aif, dts, caf, gsm, ape, wv, spx, mpc, snd, voc, tta, tak, dsf, dff
uncommon-video:
flv, vob, ts, ogv, m2ts, mts, asf, 3g2, f4v, divx, rm, rmvb, m2v, dv, xvid, swf, m4s, hevc, h264, h265
rare-audio:
8svx, amb, au, avr, cda, cvs, cvsd, cvu, dss, dvms, fap, fssd, gsrt, hcom, htk, ima, ircam, maud, nist, paf, prc, pvf, sd2, sds, sf, smp, sou, txw, vms, w64, wve, xa, aifc, al, ul, la, sb, sw, ub, uw
rare-video:
264, 265, 302, 3p2, 787, 890, aec, aep, aepx, ajp, ale, am, amc, amv, arcut, arf, avb, avc, avd, avp, avs, awlive, axm, bdm, bdmv, bik, bix, bmk, bnp, box, bs4, bsf, bu, camproj, camrec, ced, cine, cip, clpi, cmmp, cmmtpl, cmproj, cmrec, cpi, cst, cx3, d2v, d3v, dash, dat, dce, dck, dcr, ddat, dif, dir, dlx, dmb, dmsd, dmsd3d, dmsm, dmsm3d, dmss, dnc, dpa, dpg, dream, dsy, dv4, dvdmedia, dvr, dvr-ms, dvx, dxr, dzm, dzp, dzt, edl, evo, eye, f4p, fbr, fbz, fcp
documents:
pdf, doc, docx, txt, rtf, odt, xls, xlsx, csv, ppt, pptx, odp, html, htm, xml, json, md, tex, epub, mobi
images:
jpg, jpeg, png, gif, bmp, tiff, tif, webp, svg, ico, raw, cr2, nef, orf, sr2, psd, ai, eps, heic, heif
archives:
zip, rar, 7z, tar, gz, bz2, xz, iso, dmg, cab, lz, lzma, zst, lz4, sz, z, tgz, tbz2, txz, tlz, tar.gz, tar.bz2, tar.xz, tar.zst, tar.lz, tar.lzma, tar.lz4, tar.z, tar.sz
epstein:
apmaster, apversion, attr, bmp, bup, dat, data, db, db-journal, doc, ds\_store, f catalog, f\_catalog, ifo, images #1, images #2, iphoto, ivc, mpg, NULL, pdf, pps, ps, psb, psd, raf, tif, tiff, tropez, txt, xml
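Several extensions appear in more than one group above (for example mp3, avi, and doc), so a search that walks the groups in order would probe some URLs twice. The sketch below shows one way to flatten the groups into a de-duplicated list and build candidate URLs. The names `unique_extensions` and `candidate_urls` and the example base URL are hypothetical; the real URL pattern is not shown here.

```python
def unique_extensions(groups):
    """Flatten {group name: "ext, ext, ..."} into one ordered list without repeats."""
    seen = set()
    ordered = []
    for ext_list in groups.values():
        for ext in ext_list.split(","):
            ext = ext.strip()  # case is kept as written, since URLs may be case-sensitive
            if ext and ext not in seen:
                seen.add(ext)
                ordered.append(ext)
    return ordered


def candidate_urls(base, extensions):
    """Append each extension to a base URL that has no extension yet."""
    return [f"{base}.{ext}" for ext in extensions]


groups = {
    "dataset10": "avi, mp4, mov, mp3",
    "common-audio": "m4a, mp3, wav",
}
exts = unique_extensions(groups)
urls = candidate_urls("https://example.invalid/files/DOC0001", exts)
```

Keeping the original group order means the extensions most likely to hit are tried first.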
Torrent file: https://archive.org/details/data-set-9-native.tar.xz
NOTE: See INFO folder for more information.

At the moment this is the source I’ve been relying on. It is not an exact file list per se, but it is a comprehensive list of dataset magnets:
https://github.com/yung-megafone/Epstein-Files