{"slug": "what-you-need-to-know-before-touching-a-video-file", "title": "What you NEED to know before touching a video file", "summary": "The article explains that video files consist of two distinct components: a container format (like .mp4 or .mkv) that packages the data, and a separate video coding format (like H.264 or H.265) that actually compresses and encodes the video. It warns beginners that simply changing a file's extension or container does not alter the underlying video encoding, and that misunderstanding this distinction leads to common, time-consuming mistakes.", "body_md": "# What you NEED to Know Before Touching a Video File\n\nHanging out in subtitling and video re-editing communities,\nI see my fair share of novice video editors and video encoders,\nand see plenty of them make the classic beginner mistakes when it comes to working with videos.\nA man can only read \"Use Handbrake to convert your mkv to an mp4 :)\" so many times before losing it,\nso I am writing this article to channel the resulting psychic damage into something productive.\n\nIf you are new to working with videos (or, let's face it, even if you aren't),\nplease read through this guide to avoid making mistakes that can cost you lots of time, computing power, storage space, or video quality.\n\nThis guide is quite long.\nThis is hard to avoid since videos are *really, really complicated*,\nand there are lots of misconceptions to clear up.\nI have tried to keep the different sections as independent as possible so you do not have to read the whole thing at once.\n\n## The Anatomy of a Video File and Remuxing vs. 
Reencoding\nLet's start out with the most important thing:\nThe mistake I see the most and that causes experienced users the most pain to see.\n\nTo efficiently work with video files, you need to know the (extreme) basics of how video files are stored:\nWhen you download video files or copy them somewhere, you may come across various types of videos.\nYou'll probably see file extensions like `.mp4` or `.mkv` (or many others like `.webm`, `.mov`, `.avi`, `.m2ts`, and so on).\nAs a newcomer to video you might be tempted to think that this file extension is what determines the video format.\nYou might have found an `mkv` file somewhere and noticed that Vegas or Premiere cannot open it,\nso you searched for ways to convert your `mkv` file to an `mp4` file.\n\nWhile this is technically not wrong, it's far from the full story and can cause lots of misconceptions.\nIn reality, all these formats are so-called *container formats*.\nThe job of an `mkv` or `mp4` file is not to compress and encode the video,\nbut to take an already compressed video stream and package it in a way that makes it easier for video players to play them.\nContainer formats are responsible for tasks like storing multiple audio or subtitle tracks (or even multiple video tracks!) in the same file,\nstoring metadata like chapters or which tracks have which languages,\nand various other technical things.\nHowever, while they *store* the video (and audio), they're not the formats that actually *encode* it.\n\n*Actual* video coding formats are formats like H.264 (also known as AVC) or H.265 (also known as HEVC).\nSometimes they're also (somewhat incorrectly[^codec]) called *codecs*, short for \"encoder, decoder\".\nH.264 and H.265 are the most common coding formats, but you may also run into some others like VP9 and AV1 (e.g. 
in YouTube rips) or Apple ProRes.\nThese are the formats that handle the actual encoding of the video,\nwhich is the much, much, *much* harder part.\nA raw video file is *massive*, so these formats use lots of very clever and complicated tricks\nto store the video as efficiently as possible while losing as little quality as possible.\nIn particular, this means that these formats are usually *lossy*,\ni.e. that video encoding programs will cause slight changes in the video in order to be able to compress it more efficiently.\nHowever, figuring out how to make a video as small as possible while sacrificing as little quality as possible\nis *very hard*, which is why encoding a video takes a lot of time and computing power.\nThis is why rendering a video takes as long as it does.\n\n[^codec]: *Technically* the term *codec* refers to a specific *program* that can encode and decode a certain format, not the format itself, but almost nobody makes that distinction in practice. (That is, many people use \"codec\" to refer to the format itself, too. The distinction between a format and an encoding program *is* extremely important, as we'll see later.)\n\nNote that *H.264* is different from *x264*, which you may also have heard of. *H.264* is the coding format itself,\nwhile *x264* is a specific program that can encode to H.264.\nThe same is true for H.265 and x265.\nYou will see later on in this article why this distinction matters a lot.\n\nSo, to summarize:\nA video file actually consists of a *container* (in a format like `mkv` or `mp4`), which itself contains an actual video stream.\nChanging the container format is simple: You just rip out the video stream and stick it into another container.\n(Well, it's a little more complicated than that. 
But the point is: The container format is not the one that encodes the actual video, so you can switch container formats without encoding the video from scratch.)\nChanging the underlying coding format, however, or recompressing the video to change the file size,\nis harder and will a) take time and computing power, and b) lose video quality.\n\nThe process of decoding a video stream and encoding it again using the same or a different coding format is called *reencoding*.\nChanging the surrounding container format, on the other hand, is called *remuxing* (derived from \"multiplexing\", which refers to sticking multiple audio or video streams into the same file).\n\nThis is *extremely important* to know when working with videos!\nIf you try to convert your `mkv` file to an `mp4` to open it in Premiere by sticking it into a converter like Handbrake (or, worse, some online conversion tool) without knowing what you're doing,\nyou may end up reencoding your video instead, which will not only take much, much longer,\nbut also greatly hurt your video's quality.\n\nChances are that you can just remux your video to an `mp4` instead, leaving the underlying encoded video stream untouched.\nNow, granted, there are some subtleties here, in particular to do with frame rates (more on this later),\nbut the point is: lots of simple-looking \"conversion\" methods (like Handbrake, random converter websites, etc.) 
will actually reencode the video,\nwhich you want to avoid as much as possible.\nKnowing how a video file is structured, and what tools you can use to work with such files (again, more on this later)\nwill help you avoid many of these mistakes.\n\n## Video Quality\nNext, let's talk about the concept of \"video quality\", which I myself already invoked above.\nI don't think there is any other concept in video with as many misconceptions about it as video quality,\nand once again misunderstanding it can cause you to make many avoidable mistakes.\nThis is important for both encoding your own videos *and* for *selecting* which source footage you want to work with.\n\nHere is a list of things that people commonly associate with a video's quality:\n- Its resolution (1080p / 720p / 4k / etc.)\n- Its frame rate (24fps / 60fps / 144fps / etc.)\n- Its bit depth (8bit / 10bit / etc.)\n- Its file size or its bitrate (i.e. file size divided by duration)\n- Its file format (`.mkv` / `.mp4` / etc.)\n- Its video coding format (H.264 / H.265 / etc.)\n- The program used to encode the video (x264 / x265 / NVENC / etc.)\n- The settings used to encode the video\n- The video's source (Blu-ray / Web Stream / etc.)\n- The video's colors (brightness / contrast / saturation / etc.)\n- The video's color space and range (i.e. whether it's in HDR)\n- How sharp or blurry the video is\n\nIf you've paid attention in the previous section, you should know that at least some of these points, like the file format one, cannot be true (but it's still a misconception I sometimes see!).\nBut, in fact, the truth is that none of these things are necessarily related to a video's quality!\nThe program used to encode the video, combined with the settings used in it, gets the closest, but only in specific scenarios.\n\nWhy is this? 
Well, let's go through them one by one (but in a slightly different order to make things easier to present).\n\n#### The Encoding Program and its Settings\nLike I said, these two combined are what gets closest to being directly related to the video's \"quality\".\nWhy they matter is probably obvious once one mentions them as a variable:\nOf course different encoding programs can encode a video in different ways, and different settings will make them do it differently.\nBut the real lesson to learn here is that these are even parameters in the first place!\nThis is something that even semi-experienced users sometimes miss (for example, I did so when I was starting out!):\nIt's easy to think that `ffmpeg -i myvideo.mp4 myencodedvideo.mp4` is the only way to reencode a video (maybe sprinkle in `-preset slow` if you're feeling like an expert), without realizing that this will use a fixed (low) quality setting that could be adjusted with further settings.\n\nIn reality, a video coding format like H.264 *only* specifies how an encoded H.264 stream should be *decoded* to pixel data again.\nThe H.264 specification does not specify in *any* way how a video should be \"encoded\" - only the format that the resulting file should use.\nWhat makes formats like H.264 so efficient is that they provide very many different options to specify different kinds of redundant data with very few bits,\nlike \"Fill this entire 16x16 block with white\" or \"copy this 16x16 block from the previous frame and adjust it a little.\"\nBut finding these redundancies, and figuring out how to specify some frame with as few bits as possible in this format,\nwhile losing as little quality as possible, is up to the encoding program.\n\nSo, I really cannot stress enough that the encoding *settings* (including the tool used) matter the most when it comes to a video's quality.\nThis mainly manifests itself in two ways:\n1. 
The tool used.\n    When it comes to encoding H.264 or H.265, the best encoders without any competition[^x264best] are x264 and x265.\n    When you are in any situation where you can afford it, you should be using one of these encoders.\n    Most video editing programs allow you to select them (and programs like ffmpeg or Handbrake (though ideally you shouldn't use the latter) use them internally).\n  \n    Most importantly, hardware encoders like NVENC aren't useful when targeting quality and efficiency.\n    They aren't as sophisticated as x264/5 and are geared more towards low latency and high throughput.\n    Again, this is very important to realize, and it's the main reason why I am stressing this so much.\n    Hardware encoding certainly has its place in scenarios like streaming\n    where latency matters much more than efficiency or quality,\n    but when your goal is to output a high-quality encode, you shouldn't ever use it.\n\n2. The quality setting.\n    In x264/x265, the main knob to fiddle with to control quality is the setting called CRF (short for Constant Rate Factor).\n    Lower CRF means higher quality (i.e. less quality loss when encoding) at the cost of higher file size.\n  \n  My main point here is not really how to use the CRF setting, but mainly that it exists in the first place,\n  and that it above everything else controls the output quality of your video.[^crfbitrate]\n\n[^x264best]: When targeting quality\n[^crfbitrate]: You may be wondering why I am not mentioning bitrate, which is also a setting in x264/x265. This is because setting x264/x265 to some bitrate will make them *force* the video to that bitrate (when possible), even if it may not be necessary. This will make it waste bits on simple scenes that could be spent on more complex scenes instead. 
When you are not encoding for live streaming, CRF is the better setting to use, since it will automatically allocate the bits where they're needed most.\n\nThere are lots of other settings in x264/x265 that experts can use to precisely tweak their encodes,\nbut if you don't know what you're doing I'd recommend not touching them at all.\nOnce again, my main point here is really just that **encoding settings affect output quality**.\n\nNow, I said above that these parameters are what gets closest to the video quality, but only in specific scenarios.\nWhy is this?\nWell, what I mean by this is all that the encoding settings can affect is how closely the encoded video resembles the input video,\ni.e. how much quality is lost at the encoding step.\nIf your input video is already bad, then reencoding it with perfect settings will not fix it.\nThis may seem obvious, but it highlights how video quality has multiple different facets.\nSay you are choosing what footage to use as a base for your encode or edit,\nand have the choice between two sources,\nwhere one has a much higher bitrate than the other.\nUsually, you would choose the source with the higher bitrate,\nbut this only makes sense if the two sources were encoded from the same underlying source (or at least similar ones)!\nIt's very possible that the higher-bitrate source had some other destructive processing applied to it (say, sharpening, a bad upscale, lowpassing, etc. 
- more on these later).\nIn cases like these, you may want to choose the lower-bitrate source instead, if it's at least encoded from a clean base.\n\nSo, as a summary, the *quality loss* of an encode is controlled by the encoding tool and settings,\nbut the quality of an *existing video* is affected by every single step\nthat happened between it being first recorded or rendered and it arriving on your hard drive.\n\n### Interlude: So Then, What is Quality Actually?\nI've now spent a long time talking about what quality *isn't*,\nas well as what quality is *affected* by, so it might be time to try to formulate an actual definition of quality.\n\nWe already got pretty close with our discussion of encoding some video from a given source,\nwith the goal of getting an output that differs from the input as little as possible.\n*That* is what quality is:\nThe quality of an encoded and processed video is a measure of how closely it resembles the source it was created from.\n\nAgain, this sounds extremely obvious once you spell it out, but it has huge consequences that may not be clear to everyone!\nMost importantly, quality can only ever be measured relative to some *reference*,\nsome kind of ground truth.\nWithout a ground truth, everything becomes subjective.\n\nSecondly, this now says something about the \"quality\" of videos you may come across in the wild (i.e. 
ones that weren't encoded by you):\nWhen you have two or more possible sources for the same footage (say, a movie or a show) available,\nand want to evaluate their quality,\nwhat matters is which of them is closer to the original footage they were both created from.\nIn the case of a movie, this would be the original master.\nOnce again, this may sound obvious, but we will see soon how many misconceptions are formed from not understanding this principle.\n\nFinally, I need to talk about the word \"closer\" in this new definition, which is actually doing a *lot* of heavy lifting.\nWhat \"closer\" really means here is very complicated, which is why I left it somewhat vague on purpose.\nThere are lots of ways to compare how close two videos are\n(and that's assuming that they have the same resolution, frame rate, colors, etc.),\nbut none of them are perfect.\nIn particular, there are many automated \"objective\" metrics (you may have heard of PSNR, SSIM, VMAF, etc.).\nThese are very important for encoding programs to function at all,\nbut it's important to realize that [no automated metric is perfect](https://en.wikipedia.org/wiki/Goodhart%27s_law),\nand they all have their own strengths and weaknesses.\n\nBecause of this, video \"quality\" will always entail some degree of subjectivity.\nStill, there are some things that are almost certainly wrong, and you'll see some of them in the following sections.\n\n### Back to Mythbusting\nYou should now have a decent idea of what quality actually means, and what it's determined by.\nStill, I want to spell out explicitly why various other parameters do not directly correlate to quality,\nand clear up associated misconceptions.\nSo, let's go through them one by one.\n\n#### File Size or Bitrate\nThis should hopefully be clear from the section on encoding settings.\nYes, more bits usually means better encode quality *if* everything else stays the same,\nbut ultimately the full package of encoding settings (which bitrate can be 
*one* of) is what matters.\nDifferent encoders or settings will result in different efficiency levels,\nso you can have two encodes of the same quality and different file sizes or vice-versa.\nFor example, NVENC allows very fast encoding at the expense of larger file sizes,\nso an x264 encode (with decent settings) will get you much smaller files of the same quality\n(but will of course take much longer).\n\n#### Video Coding Format (H.264 / H.265 / AV1 / etc.)\nAgain, hopefully this should mostly be clear now:\nWhat matters is the *tool* used to encode the video (and its settings), not the format it encodes to.\nA more advanced format will allow for more techniques to efficiently encode a video,\nbut that only matters if the encoding program properly makes use of them.\n\nIn particular, there is an often quoted factoid that \"HEVC is 50% more efficient than AVC\",\nwhich in reality is just plain wrong.\nH.265 (that is, current standard H.265 encoders) does usually provide an efficiency gain over H.264 (that is, current standard H.264 encoders),\nbut if it does it's by far less than 50%.\nAnd, as always, the format is just one facet of the full \"Encoder and settings used\" package.\nOn pirate sites I sometimes see comments like \"I want to download this, but it's AVC. Is there an HEVC version somewhere?\",\nand I hope that I don't have to explain anything further about why that makes no sense.\n\nAnother important point is that the strengths and weaknesses of encoding tooling can greatly differ based on the level of quality you're targeting.\nAV1 is the current new and fancy coding format,\nand modern AV1 encoders (when used correctly) can yield incredible efficiency gains over x264/5 *on low-fidelity encodes*.\nHowever, for high-quality encodes (i.e. 
targeting visual transparency),\nx264 and x265 are still far ahead.\nIt's for reasons like these that it's very hard to make blanket statements on the efficiencies of different encoders.\n\nOne final thing to mention here is that the coding format *will* affect how difficult it is to *decode* your video.\nOlder or smaller devices may struggle to decode more advanced formats like AV1 or even H.265 (or specific profiles of formats like 10bit H.264).\nThis doesn't directly affect quality, but it may be important to mention for people that plan on making their own encodes:\nIf you're targeting high player compatibility, you may need to keep this in mind and (for example) release an 8bit H.264 version alongside your main release.\n\n#### The File Format (.mkv / .mp4)\nHopefully I don't have to say anything more here.\nRead the first section again if this is not yet clear to you.\nBut I *have* seen \"This is an mp4 file, can someone upload an mkv file instead?\" more than once,\nwhich is why I need to spell this out here.\nIf this was you, look through the later sections to see how to fix these things for yourself.\n\n#### Resolution\nThis may be the biggest misconception of them all:\nMany people effectively think that resolution is the *only* thing that controls a video's quality.\nMaybe it's because of how YouTube and many other streaming platforms expose resolution as the only setting to change \"quality\".\nEither way, this is not the case.\nWe've already seen why this is true in general, but let's go over some specific cases:\n\n- Often, people downscale videos to some lower resolution in order to save file size.\n  For example, if they have some 1080p video that, when run through their encoder,\n  results in a 1GB file while they'd like their file to only be 500MB,\n  they'd try to render it to a 720p video instead.\n  \n  But, as we've seen by now, this is usually not the right way to go about it.\n  If your main goal is to bring down the file size,\n  this can 
be done much better by adjusting the encoding settings instead:\n  \n  Different parts and scenes of a video will be easier or harder to compress.\n  Scenes without a lot of motion or flat scenes without lots of details or grain will be easier to encode\n  than scenes with lots of moving elements.\n  Encoders know this and are able to intelligently allocate bits where they are most needed,\n  focusing on *visual* quality rather than a uniformly fixed level of precision.\n  \n  By leaving the quality reduction to the encoder instead of downscaling before encoding,\n  the encoder can decide where to save bits,\n  rather than being forced to lose detail *everywhere*.\n  This will often result in a much better-looking result at the same file size.\n  \n  Additionally, encoding downscaled video isn't actually as efficient as one might think,\n  at least not with modern encoding formats:\n  Since all the elements in the video get squished down, there'll be more small details in the same region of space,\n  which makes them harder to encode.\n  \n  Now, if you're targeting *extremely* small file sizes,\n  so that achieving these at (say) 1080p with very low bitrates is impossible without extremely visible artifacts,\n  then you could consider reducing the resolution to make the artifacts more uniform.\n  But resolution definitely shouldn't be the *first* knob you reach for to adjust file size:\n  That should be the CRF or bitrate.\n- Sometimes, some geniuses decide to use that new and fancy AI upscaling software they saw marketed somewhere\n  to \"improve\" some video and upscale it to some higher resolution like 4k,\n  and probably add a bunch of sharpening and whatnot in the process.\n  I could write an entire article about AI upscaling alone\n  (in fact, [I have](https://gist.github.com/arch1t3cht/656c4ccec31af7a3f75555efe157d9e2)),\n  but to keep things short:\n  We've established that *quality* measures how close some video is to the source it originally came from.\n 
 Applying *any* kind of post-processing[^postprocessing] to the video can only ever take it further away from the source,\n  not closer, and upscaling (AI or not) is no exception.\n  Any kind of \"detail\" you may see the upscale add can only be invented, not added:\n  The upscale fundamentally only has its input to work with, any extra data has to be pulled out of thin air.\n  And no, I don't want to hear that *your* AI upscaling model is actually really good and better than the other ones,\n  so it's actually okay to upscale with it.\n  This is not a question of how good the upscaling process is,\n  it's the process of upscaling itself that's already inherently lossy.\n  \n  There are some extra nuances here (read the linked post for some of them) and AI is not *inherently* bad,\n  but please just trust me as a more experienced person when I tell you that\n  **you should not upscale videos just for the sake of upscaling them**.\n[^postprocessing]: Unless you're *extremely* surgical with it and know *exactly* what you're doing, which the target audience to this post definitely doesn't\n- These lessons about resolution also matter when it comes to *choosing* sources.\n  Once again, *quality* refers to how close a video is to its original source,\n  and it's very much possible for that original source to itself have a low resolution.\n  \n  This is especially relevant for digital anime, which is often produced at some resolution below 1080p.\n  Even in 2025, production resolutions like 720p or 1500x844 are still very common,\n  with the 1080p release being upscaled (usually using conventional methods, not AI) from that.\n  \n  Usually this is not too important to the end user,\n  but it does mean that if you see a new fancy 4k release of some digitally animated show being advertised,\n  the chances are extremely high that this is not *truly* 4k,\n  and instead just upscaled from whatever 1080p master they had before.\n  Note, though, that anime movies or shows that 
were originally animated on film can be a different story.\n  \n  Similarly, this is very relevant for digitally animated shows that were originally released on DVD.\n  For a good portion of shows from that era, there exist 1080p Blu-rays that are extremely badly upscaled,\n  so that the DVD will be a much better source.\n  (However, DVDs bring a *ton* of other complications with them,\n  so in the end you should pray that someone else has already done the work of making a proper encode for you.)\n  There are also plenty of shows where this isn't the case,\n  especially if the Blu-ray is a better rescan of a film or if the show has a LaserDisc release,\n  but the general takeaway is that \"higher resolution does not automatically mean better\" also extends to official releases.\n  \nSo, as a summary, keep in mind that resolution is not the same as quality.\nA higher resolution may not mean better quality, and lowering the resolution may not be the best way to save file size.\n\n#### Frame Rate\nThis is fairly similar to the resolution story, so there's not much more to say here.\nJust like AI upscaling just for the sake of upscaling, frame interpolation is bad.\nThere's not even any nuance here this time: just don't do it. 
(Do I need to spell out what \"quality\" means again?)\nMovies and TV shows are usually 24fps (well, often they're actually 23.976fps[^fracfps], but you get the idea),\nso if you find a source somewhere that has some different frame rate, double-check if that is the correct one.\n\n[^fracfps]: And *that* is actually 24000/1001 fps and *also* constant frame rates are usually a lie anyway but you get the idea.\n\n#### Bit Depth (8bit / 10bit / etc.)\nThis is a tricky one, and I am mainly mentioning it to talk about a very specific technique in encoding.\n\nBit depth is a slightly more niche concept, so I'll explain it just in case:\nBit depth refers to how many color values are possible for each pixel.\nAlmost all images and videos you'll come across are 8bit.\nFor RGB colors, this would mean 256 red/green/blue color values per pixel,\nwhich results in `256 * 256 * 256 = 16777216` total possible RGB color values.\nIn reality, video colors are not *actually* stored in RGB,\nand usually do not exhaust their full available range of values,\nbut for getting a basic intuition this is not too important.\n\nHowever, it's also possible for videos to have a higher bit depth like 10bit or 12bit.\nApart from masters, this is common for HDR video.\n\nIn principle, the same rules as for resolution and frame rate apply:\nDon't change any aspects of your video without a good reason,\nso don't change the bit depth either if you can avoid it.\nThat said, it *is* common in video encoding to actually encode footage at a bit depth higher than the source's.\nThis is due to intricacies of video encoding that are too complicated to explain here,\nbut the upshot is that encoding at a higher bit depth can actually result in an *increase* in efficiency.\nThis is why you may see 10bit encodes of 8bit footage:\nThese do not mean that there was a 10bit source somewhere,\nthey're just encoded in this way because it was more efficient.\n\nThis doesn't contradict our philosophy of not changing 
anything without good reason,\nit just means that there is a \"good reason\" in this case.\nIn particular, this is feasible here because, unlike with resolution or frame rate, increasing bit depth is not a destructive process (when done correctly)[^scaledestructive].\n\n[^scaledestructive]: Scaling or changing the frame rate can *also* be nondestructive when done correctly, but they're much easier to get wrong than the bit depth.\n\n(If you're interested in *why* encoding at a higher bit depth is more efficient, here's an attempt at a basic explanation:\nIntuitively, you might be confused about this, since adding more bits ought to correspond to more bits to store,\nwhich results in more required file size.\nBut the important thing to realize is that the \"bit depth\" in modern video coding formats is not actually\nwhat controls the level of precision with which pixel values (or, in reality, DCT coefficients) are stored.\nThat level of precision is controlled by the quantization level, which is a different parameter.\n(And that is in fact the main knob that encoders turn to regulate bit rate and quality.)\nInstead, the actual bit depth controls the level of precision at which all mathematical operations\n(like motion prediction and DCTs) are performed when decoding the video, as well as the allowable scale for the quantization level.\nEncoding at a higher bit depth means that operations are performed with more precision,\nwhich makes certain encoding techniques more precise and hence more efficient, which in turn saves space.\nHowever, raising the bit depth also means that slightly more bits need to be spent to encode the actual quantization factor (and other elements),\nso at some point you do get diminishing returns.\nEmpirically it turns out that encoding at 10bit works pretty well for 8bit content, but that encoding at 12bit is not worth it.)\n\n#### The Video's Source (Blu-ray / Web Stream / etc.)\nThis is another slightly tricky one.\n*Usually*, a Blu-ray 
release of some footage will be better than a web version from the same source,\non account of having a much higher bit rate.\nHowever, this doesn't *always* need to be the case:\nThe fact that various post-processing operations can affect the quality of the video also applies to the authoring stage\n(that is, the process of taking a show or movie's master, and putting it onto a Blu-ray,\nperforming all the necessary conversion and compression that this entails),\nand it is very much possible for a Blu-ray release to have some destructive filtering applied to it that the web releases do not (or for the Blu-ray release to just have terrible encoding settings).\nDifferent web streams from different sites, or different Blu-rays from different authoring companies can be different too.\n\nAgain, this is especially relevant in anime, where some Blu-ray authoring companies apply a blur to the video before encoding it, which hurts quality[^lowpass].\n\n[^lowpass]: *Why* this happens is complicated (and we don't even fully know ourselves). The technical term is *lowpassing*, with the idea being to remove high frequencies in advance in order to improve compressibility, but in practice this is just counterproductive. We suspect that certain proprietary authoring software suites have this lowpassing enabled by default, and that authoring studios aren't aware of it or its negative consequences.\n\nIf you're just starting out in working with video,\nit may be hard to judge for yourself which source is better,\nbut the main thing I want to convey here is that \"Blu-ray\" does not automatically have to mean \"better quality\".\nAlways try to manually evaluate sources using your eyes,\nor ask someone more experienced for advice on which source to pick (see below for some resources on this).\n\n#### HDR vs. 
SDR\nHDR (High Dynamic Range) is another complicated topic.\nWhat I mainly want to convey here is that, once again, HDR does not *automatically* mean \"better than SDR\".\nIf there are HDR and SDR sources of some footage available,\nit all depends on how they were created, and from what kind of common source (if there is one).\nIt's possible for the SDR version to be a direct tonemap of the HDR one (in which case the HDR version is the objectively better source) or for the HDR version to have been inverse tonemapped from the SDR one (in which case it's the other way around), or for them to have both been created from some base source (in which case it depends on *how*).\nFor example, it is not uncommon for official HDR releases of some footage to never actually reach a brightness above 100 nits,\nand hence be no better than the SDR version.\n\nIn particular, you should be very suspicious of any HDR (or Dolby Vision) source you may find for a video that wasn't officially released in HDR anywhere.\nIt's very much possible that this \"HDR\" version was created artificially from the SDR version by whoever released it,\nin which case (just like an AI upscale) there's no reason to use it over the base SDR version.\n\nAgain, HDR is a very complex topic and these things can be very hard to evaluate as a newcomer, but the important thing is to know that this subtlety exists in the first place.\nIf the SDR version looks decent, you may just want to save yourself (and your viewers, if there are any) the trouble of dealing with HDR and work with the SDR version.\n\n#### Colors\nAs I have already repeated ad nauseam,\nthe goal of video encoding is to change the source as little as possible.\nJust like you shouldn't change the resolution or frame rate without a good reason,\nthe same applies to colors.\nI sometimes see releases where people \"improved the colors :)\",\nand it turns out that what they really did was fiddle with the brightness and saturation sliders until it 
looked \"better\"\n(read: brighter and more vibrant).[^colorfix]\nBut doing this is the opposite of staying true to the source.\nColor grading is very important for editing photos or raw footage,\nbut when you're working with footage that was *already* edited and mastered by the artists,\nany further \"color corrections\" go against the artistic intent.\n\n[^colorfix]: There *are* some actual types of errors in encoding that affect colors and can be objectively fixed, like double range compression or mistagged color matrices, but those are not the same thing as fiddling with some sliders, and they once again require you to know exactly what you're doing.\n\nIn short, remember that \"brighter and more saturated\" does not mean \"better\".\n\nFinally, while we're on the topic of colors:\nWhen you run an encode, especially from some kind of video editing software,\nmake sure to make a direct comparison of some output frames to the corresponding input frames\n*using good viewing software* (i.e. 
mpv or vs-preview, see below).\nIf you see a noticeable color mismatch, this may be due to some misconfiguration in your editing software or project\n(like the color matrix or color range)\nthat you will need to look into.\nRead the section on [Color Space Parameters](#color-space-parameters) for more information on this.\n\n#### Sharpness\nLast but definitely not least, we have another one of the bigger misconceptions.\nMany people think that \"sharp\" means \"higher quality\" and, in particular, that \"blurry\" means \"lower quality\".\nWhile it's true that a lower quality encode can manifest itself in more noise around lines,\nand that reducing the resolution (which we've already established you probably shouldn't do) will automatically mean that lines can no longer be as sharp,\nthis is far from a one-to-one correspondence.\n\nIn reality, the exact same thing as for resolutions, frame rates, or colors applies.\nYou want to stay as close to your original video as possible.\nIf some elements of the original video are comparatively blurry, chances are that they're *meant* to be blurry.\n(Or, at the very least, any kind of sharpening process will not be able to distinguish\nbetween elements that are meant to be blurry and ones that aren't.)\n\nHence, just like you shouldn't fiddle with color sliders just to \"improve the colors\",\nyou shouldn't slap a sharpening filter on top of your video just to \"make it sharper :)\".\nThis will only take your video further away from the source, not closer.[^sharpening]\n\n[^sharpening]: Once again, some caveats apply here in specific cases. For example, if you absolutely cannot avoid upscaling your video, you might as well find a \"good\" way (whatever that means) to upscale it, and try to add as little blurring as possible. 
But sharpening just for the sake of sharpening is not a good idea.\n\nIt's true that to the layman viewer's eye, sharper content will look more appealing.\nBut once you know what to look for, you will see that sharpening creates a lot of ugly artifacts like\nline warping or [haloing](https://en.wikipedia.org/wiki/Ringing_artifacts).\nLike with upscaling, please just take my word for it when I tell you that **prioritizing sharpness above all else is not a good idea**.\n\n### Summary\nNow, that was a lot of text, but unfortunately it was needed.\nVideo is very, *very* complicated, and this was just the tip of the tip of the iceberg.\nIn case that was too much information to dump on you all at once, let me summarize the most important takeaways:\n\n- You cannot judge a video's quality just by looking at its resolution and file size.\n- If in any way possible, use x264 or x265 to encode your video.\n  Use the CRF setting to adjust quality vs. file size instead of jumping directly to downscaling.\n- You should not change any aspect of your video unless you know exactly what you're doing\n  (and the target audience of this post does not).\n  This affects resolution, frame rate, colors, sharpness, and any other postprocessing filters you might think of applying.\n\n### Learning to Spot Quality Loss\nAs a novice video encoder, it may be hard to see quality loss in the beginning.\nYou may come across images or comparisons where some experienced encoder says \"Oh my god this looks terrible!!\" while you're thinking \"Are those the same picture?\".\n\nBut don't worry, this is normal.\nYou have to know what to look for in an image, and you have to train your eyes to look for it.\n(But know that this is cursed knowledge. 
Once you learn how to spot artifacts, you can never look at video the same again.)\nA full guide on how to spot video artifacts would take up an entire second article with many example images,\nbut as a short summary, here is a list of areas you should focus on most:\n\n- Dark areas, especially dark gradients\n- Strong colors, in particular black edges on deep and dark reds\n- Areas with lots of (static or dynamic) grain or texture\n- The spaces *around* sharp lines and edges.\n  Don't look at the edges themselves, instead look for noise *next* to them.\n  In particular, look for bright \"halos\" around edges (also called [ringing](https://en.wikipedia.org/wiki/Ringing_artifacts))\n- In particular, look for noise next to sharp full-resolution elements like on-screen text\n- Image borders\n\nKeep in mind that what constitutes acceptable quality loss is always in the eye of the beholder,\nand that that is a two-way street.\nIf you are creating encodes mainly for yourself, and you yourself cannot see any quality loss, then there's no reason to worry about it even if someone else tells you it's visible.\nHowever, on the other hand, you also shouldn't criticize anyone for releasing high file size encodes to prevent quality loss just because *you* can't see the artifacts they would prevent.\n\n## Color Space Parameters\nThis section is a bit more advanced than the other sections.\nYou can skip it on your first time reading, if you prefer, or only read the summary at the end of the section.\n\nThere are some things that can go wrong when modifying a video that you should be aware of.\nThey shouldn't happen with the workflows described below,\nbut if you use a different workflow and/or add extra steps, you may run into these.\n\nI have mentioned above how videos are (usually) not actually stored as RGB,\nand this is where this becomes relevant.\nThe colors of video frames are usually stored in a different color space called YCbCr.\nHere, the \"Y\" part is called \"luma\" 
and more or less represents a pixel's \"brightness\",\nwhile the \"Cb\" and \"Cr\" parts are called \"chroma\" and represent the color tone.\nThis goes all the way back to how analog color television had to be built in a way that is backwards-compatible with black-and-white television,\nbut we're stuck with it now.\n\nMoreover, the chroma part of the video is often stored at half the resolution (in both directions) of the luma part.\nThis is called \"chroma subsampling\".\nFor example, a typical 1920x1080 video would be stored as `1920*1080` luma values and two sets of `960*540` chroma values.\nWhen playing the decoded video, your media player first needs to upscale the two 960x540 chroma \"planes\" to 1920x1080.\nOnce again, this mostly has historical reasons nowadays.\n\nCheck this [comparison](https://slow.pics/c/FHxrfYwm?image-fit=none) for an example of how an RGB image splits into Y/Cb/Cr parts.\n\nNow, why do you need to know this?\nWell, the problem is that there is *more than one way* to convert an RGB[^rgb] image to YCbCr and back.\nThe two most relevant of these are called BT.601 and BT.709.\nBoth of these specify a \"matrix\" (if you don't know linear algebra, just think of it as a \"formula\") that can be used to compute a YCbCr value from an RGB value and vice-versa.\nHowever, converting a given YCbCr value to RGB via BT.601 will give a different RGB color than converting it via BT.709!\n\n[^rgb]: In case you have heard before that there are also multiple \"RGB\" color spaces (if you have not, just ignore this footnote),\nI should clarify that I do not mean any specific RGB color space here (so in particular not sRGB).\nFor the purposes of this section it's enough to interpret the term \"RGB colors\" as just \"a set of three color values that each specify *some* sort of R/G/B component\", without worrying about, say, primaries or transfer characteristics.\n\nOut of these two color matrices, BT.601 is the older one.\nIt was used for old SD-era content like 
DVDs.\nBT.709, on the other hand, is the newer matrix.\nThe vast majority of videos you will run into (say: movies and TV shows that were produced in 720p or higher, screen captures, most YouTube videos[^ytmatrix], etc.) should be BT.709.\nOne notable exception here is HDR video, but dealing with that properly is an entirely different beast of its own and is outside the scope of this post.\n\n[^ytmatrix]: You can actually see the color matrix a YouTube video uses by right-clicking the video and clicking \"Stats for Nerds\".\n\nHowever, some old software (including ffmpeg!) may still default to using BT.601 in certain cases even when BT.709 should be used instead.\nThis is why it is important to know about this distinction.\n\nNow, a media player playing back a video file needs to *know* which color matrix (BT.601 or BT.709) a video uses.\nFor a \"well-behaved\" video, this is specified in the video's metadata, but unfortunately many videos *aren't* this well-behaved.\n(And many video editing or encoding programs do not always store this information in the metadata of the videos they output.)\n\nIf the media player does not know what color matrix a video file uses, it has to guess based on the information it has, like e.g. the video's resolution.\nFor example, some video players may guess that an untagged (i.e. 
with no specified matrix) video uses BT.601 if its height is less than, say, 600 (suggesting that this is an SD era video).\n\nThis can already cause some surprising behavior!\nLet's say you have an untagged 1080p video, which you then downscale to 480p (which I have explained above may be a bad idea, but that's beside the point here).\nIf you open the 1080p video and the 480p video side by side in such a media player,\nthe two videos will be displayed with different colors!\n\nIf you were seeing this without knowing what's going on, you might be led to conclude that the video encoder is somehow changing the colors,\nbut the reality is that the change in resolution is causing the media player to guess a different color matrix.\nHence, you can solve this by tagging both your input and your output video as BT.709.\n(If you tag the input video as BT.709, most encoders should also copy this to your output video.)\n\nFor reasons like these, this topic can cause great headaches, but the upshot is:\n**If you see your reencode somehow changing colors, check your input and output video's color matrices.**\nIf your source video is untagged, it can be hard to figure out what color matrix it *should* have,\nbut when in doubt you can at least make sure that your output has the same matrix as your input\n(or, if your input is untagged, the same matrix as what a player like mpv would guess).\nOr, of course, you can ask someone more experienced for advice.\n\nYou can check a file's color matrix in MediaInfo (see the section below for the tools mentioned here),\nand you can edit it in MKVToolNix.\nTo see what color matrix mpv assumes on your video, open your video in mpv, press `i`, and check the `Colormatrix: ` field under the \"Video\" section (*not* the \"Display\" section).\n\n### Other Parameters\nUnfortunately, the story does not end with the color matrix.\nThe matrix is just one of multiple parameters that specify how to interpret a video's colors.\nThe full list of 
parameters is:\n\n- The color matrix (BT.709 / BT.601 / BT.2020 / etc.). Note that BT.601 is sometimes also called SMPTE170M or BT.470BG.\n- The color range (Limited or Full)\n- The transfer characteristics, or gamma (BT.709 / BT.601 / BT.1886 / sRGB / PQ / HLG / etc.)[^transfer]\n- The primaries (BT.709 / BT.601 / BT.2020 / etc.)\n- The chroma location (Left or Center Left / Center / Top Left / etc.)\n\n[^transfer]: There is a distinction between OETF and EOTF to be made here (as well as subtleties like an OETF of BT.709 often being interpreted as an EOTF of BT.1886, as well as sRGB sometimes being displayed as 2.2 gamma or BT.1886 instead or vice-versa, and so on... God, gamma is a mess), but all of that is out of scope of this article.\nI'm only listing all of these so people can recognize the abbreviations when they see them in the wild.\n\nYes, codes like \"BT.709\" can refer to *multiple* of these parameters.\nBut this doesn't cause much confusion in my experience, since most programs make a clear distinction between matrix, transfer, and primaries.\nIf you hear someone talk about \"a BT.709 video\" (without specifying whether they mean matrix, transfer, or primaries),\nthey probably mean the matrix.\n(A video with a BT.709 matrix does not need to automatically have BT.709 transfer and primaries (it could have sRGB instead, for example), but if you see a video tagged with a mix of BT.709 and BT.601 that usually means that something went wrong *somewhere*.)\n\nAll of these five parameters are values that are necessary to properly interpret a video's colors,\nand that *should* be provided as metadata but often aren't.\nLuckily, if you're editing a video, you generally only need to worry about the matrix, range, and chroma location - the other two (transfer and primaries) are less likely to break.\n\nWe have already talked about the matrix, so let's continue with the range.\nThe vast majority of videos you're dealing with will have a \"Limited\" range.\nThe 
color range should also not break when reencoding video with, say, ffmpeg, but I *have* seen issues with color range when working with video editors like Vegas.\nWhen you see your output colors being much darker or much brighter than your input, check the color range settings of your editing program.\n\nFinally, there is the chroma location.\nThis is the most subtle of all of these, since it's harder to notice issues with it,\nbut it may also be the one with the highest risk of breaking - especially when working with video editing programs.\n\nWhen applying chroma subsampling (which, as you hopefully remember, is the process of scaling down the chroma by a factor of two in each direction), there are multiple ways to perform the downscale process.\nUltimately, what you need to do is turn each 2x2 group of four chroma values into a single value.\nOne way to do this would be to simply take the top left pixel of each 2x2 group.\nAlternatively, one could average all four pixel values together and obtain a new value that could be interpreted as the interpolated \"middle\" value of the 2x2 square.\nThese two methods result in different alignments of the chroma sample grid relative to the video's pixel grid:\nA chroma value could either \"come from\" the top left of a 2x2 square of luma values, or from the \"middle\" (or from some other location).\nThis is called the \"chroma location\", and it is yet another parameter that media players need to know to play a file back properly:\nWhen scaling the chroma back to the luma's full resolution, the scaling process needs to factor in the chroma location in order to not introduce a shift relative to the original chroma.\n\nFor reasons that I have not yet been able to uncover, the standard chroma location for most videos is actually \"center left\",\ni.e. 
in the middle between the left two samples of the four luma samples.\nHence, video players (or at least good players like mpv) will default to \"center left\" chroma when a video's chroma location is not tagged.\nHowever, a lot of video processing software does not handle chroma location correctly, e.g. often using a chroma location of \"center\" instead.\nBecause of this, depending on your workflow, editing a video may sometimes introduce a \"chroma shift\" where the chroma ends up shifted by some small amount relative to the luma.\nThis can be hard for a novice video encoder to spot, but it manifests in colored lines consistently having a slightly different color on one side of the line than on the other side.\nAs before, chroma location issues can usually be fixed by tagging your videos and/or configuring your tools correctly (and by asking a more experienced person if necessary).\n\n**In summary**: Depending on your workflow, processing your video may risk breaking your video's color matrix, color range, or chroma location. (And, moreover, it is possible for your sources to have broken or mistagged matrices, ranges, or chroma locations. But if you are starting out it may be better not to worry about that part, and to just ensure that you aren't introducing any *additional* problems involving these parameters.) 
These issues can manifest as follows:\n- Wrong (usually mistagged or untagged) color matrices:\n  Generally manifests in colors slightly changing in hue and intensity.\n  See [this comparison](https://slow.pics/c/ILKT9eMG) for an example.\n  This can usually be fixed by tagging your input correctly.\n- Wrong color range:\n  This will manifest in colors becoming much less saturated or more saturated throughout.\n  See [this comparison](https://slow.pics/c/wMsTpPsP) for an example.\n  Color range issues should be fairly rare for normal reencodes, but can happen when using a misconfigured video editor.\n- Chroma shift:\n  See [this comparison](https://slow.pics/c/HUDn94wt) for an example.\n  This one can be hard to spot for beginners.\n  The easiest way to spot a shift with a given reference is to (zoom in a lot and) look at areas with very strong colors.\n  Spotting a chroma shift without a reference is even harder,\n  but in this comparison you can see that diagonal lines on the \"chroma shift\" image have a slight reddish glow on the right side and a blue-ish glow on the left side.\n  When this glow is consistent throughout the entire video, that may be indicative of a chroma shift.\n  (Make sure to not confuse an intentionally added chromatic aberration effect with a chroma shift, though.[^ca])\n  How to fix a chroma shift can depend a lot on your workflow (so figuring out where it is introduced should be your first step), but tagging your input video and checking your editing software's configuration will definitely help.\n  If you cannot figure out how to configure your software to avoid a chroma shift, you can try retagging your output.\n\n[^ca]: If you're wondering how to distinguish between these (apart from asking someone more experienced): A digital chromatic aberration effect is usually created by shifting the RGB planes relative to each other, while a chroma shift is a shift of, well, the chroma plane relative to the luma plane. 
Chromatic aberration will result in differently colored glow around edges and probably cannot be fixed by retagging the chroma location.\n\nThis section was far more technical than the rest, but the good news is that you do not need to understand all of it right away.\nThe important part to remember is that processing a video can cause some issues related to color spaces.\nWhen trying out a new workflow, it may be a good idea to compare an output frame to the corresponding input frame and look for any color mismatches that look like the ones in the examples above.\nIf you don't see any, you can forget everything you read in this section.\nIf you do find something, this section may help you to fix it.\n\n## Subtitles\nWhen you're working on an anime or some other media that is not in your target audience's language,\nyou will need to add subtitles, in which case there are a couple of things you should know.\n\nThe most powerful format for subtitles is Advanced SubStation Alpha, or ASS for short[^ass].\nASS subtitles not only allow showing subtitles for spoken dialogue\nbut also creating translations for on-screen text that blend in seamlessly with the original video.\nEven if you do not plan to make subtitles like these yourself,\nyou probably want to ship subtitles you downloaded from somewhere, which will probably be in the ASS format.\n\n[^ass]: Yeah, the jokes never get old.\n\nOne important thing to know is that the only container format that really supports ASS subtitles is mkv.\nIf, for some reason (probably because you're targeting some kind of streaming),\nyou do not want to release an mkv file in the end,\nyou will need to hardsub.\nSee below for the best way to do this.\n\nSecondly, if your goal is to edit your video,\nyou will have to think about how to match your subtitles to your edit.\nThere is no good automated solution here.\nYour options are basically:\n1. Manually retime the subtitles in a program like Aegisub, or\n2. 
Hardsub the subtitles and edit the hardsubbed video.\n\nIn general, you should avoid hardsubbing when possible, since it\n- involves reencoding, and hence introduces quality loss,\n- takes time (which may not be a problem when you are only editing your video once,\n  but becomes increasingly annoying if you want to make incremental fixes later on),\n- makes it much harder for anyone, including yourself, to change some aspect of the subtitles later on.\n\nHowever, retiming all subtitles yourself for a quick edit is also a lot of effort.\nIn the end, the choice is yours.\nIf you do end up hardsubbing, make sure you do it correctly.\nRead the later sections for how.\n\n## Recommended Tools and Workflows\nI've now talked a lot about what you *shouldn't* do, so what should you do instead?\nThis section contains some useful tools, as well as workflows to do certain things the right way.\n\n### Recommended Tools\n- [MediaInfo](https://mediaarea.net/en/MediaInfo): If you install one tool from this list, you should install this one, which is why it's listed at the top.\n  Step 0 in anything to do with video is finding out what exactly you are working with, and MediaInfo will tell you exactly that.\n  Open a file in MediaInfo and switch the view to \"Text\" at the top to see all important data.\n  If you ever need help from someone more experienced with video, sending them a proper MediaInfo dump of your file is a great way to get them into a good mood.\n- [mpv](https://mpv.io) is the single best media player out there, and ideally you should use it.\n  MPC-HC (if you get the actual [latest version](https://github.com/clsid2/mpc-hc/releases)) is alright too,\n  but mpv is the definite best.\n  In particular, VLC is not recommended.\n  \n  Apart from simply watching the video, mpv can also make screenshots and encode videos for you.\n  The latter in particular is very helpful for hardsubbing.\n- [ffmpeg](https://ffmpeg.org) is your Swiss army knife for everything to do with 
video,\n  from inspecting to encoding and remuxing.\n  (Though you should know that it's usually not actually ffmpeg itself that's doing the encoding.\n  When you encode to H.264/H.265 with ffmpeg, ffmpeg is actually calling x264/x265 internally.\n  I'm mainly bringing this up because it bothers me how everyone praises ffmpeg for being good at encoding when it's really x264/5,\n  but this also means that you should check the x264/5 documentation if you need help with encoding, not ffmpeg's.)\n  \n    Note that ffmpeg is kind of a jack of all trades, master of none.\n    It can do a lot of things fairly well,\n    but for specific tasks there are often specialized tools available that do it even better.\n  \n    FFmpeg is a *command-line tool*.\n    If you have never used a command-line tool before, read [this page](https://developer.mozilla.org/en-US/docs/Learn_web_development/Getting_started/Environment_setup/Command_line) for a quick primer.\n    \n    Before you start complaining about how complicated ffmpeg is and how arcane its syntax is,\n    do yourself a favor and read the start of [its documentation](https://ffmpeg.org/ffmpeg.html).\n    It turns out that reading the (f.) manual actually helps a lot!\n- Use [SlowPics](https://slow.pics) if you want to share image comparisons.\n  There are also ways to [automatically upload comparisons to SlowPics](https://jaded-encoding-thaumaturgy.github.io/JET-guide/master/misc/comparison/) using [vs-preview](https://github.com/Jaded-Encoding-Thaumaturgy/vs-preview/), but those are a bit more involved.\n  \n    When looking at a SlowPics comparison, uncheck \"Show border\" and \"Smooth scaling\" at the bottom and use the clicker comparison rather than a slider. 
Use the number keys (1/2/3/etc) to switch between images.\n- [MKVToolNix](https://mkvtoolnix.download/downloads.html#windows) for muxing mkv files.\n  You can do this with ffmpeg too, but MKVToolNix has a GUI if you need one (and is better in certain ways).\n- [MKVExtractGUI](https://www.videohelp.com/software/MKVExtractGUI-2) or [MKVcleaver](https://www.videohelp.com/software/MKVcleaver) to extract tracks from mkv files (or learn how to do it with ffmpeg).\n- [Aegisub](https://aegisub.org) to edit subtitles.\n  Note that you can also simply open `.srt` or `.ass` subtitles in a text editor like Notepad\n  if you need to quickly check something,\n  but if you want to do actual editing or timing you should use a proper tool like Aegisub.\n- [MkvToMp4](https://www.videohelp.com/software/MkvToMp4) for remuxing an `.mkv` file to an `.mp4` with a proper constant frame rate (see below).\n\n### Tools You Should *Not* Use\n- Avoid using Handbrake if possible.\n  Handbrake has a lot of footguns like suddenly changing the frame rate or adding interlacing.\n  I would recommend you to just learn basic ffmpeg usage instead.\n  If you must use a GUI, try [Staxrip](https://github.com/staxrip/staxrip).\n- Don't use any file conversion websites.\n  Those all just use ffmpeg under the hood anyway,\n  so you'd be better off just spending the 10 minutes to learn how to use ffmpeg directly.\n  Hopefully I don't need to tell you that you shouldn't fall for any [30$ x264/ffmpeg wrappers](https://compresto.app) either.\n- Don't use Topaz AI, Anime4k, RealESRGAN, RIFE, etc. 
Trust me, just don't.\n- Don't use imgsli for image comparisons: it (lossily) converts its images to JPEG, which invalidates any comparisons.\n  Use SlowPics (linked above), and try not to use the slider feature there either.\n  You can spot a lot more differences by switching the full images back and forth than with a slider.\n\n### Workflows\nFinally, let me explain a few things you *should* do.\n\nIf you've read the previous sections, you'll know that reencoding a video will hurt its quality\n(and reencoding it over and over will hurt its quality even more, since later encodes will spend bits to reproduce the artifacts introduced in the previous encodes).\nHence, you should make sure that you only reencode when absolutely necessary,\nand do all other necessary conversions through remuxing.\nIdeally, that would mean (lossily) reencoding only once, at the very end of your workflow.\nIf your editing software does not allow encoding using x264/x265, you can export a lossless render from it and then encode that lossless render with ffmpeg.\n\nSometimes, you cannot easily avoid reencoding an additional time at some other step in your workflow.\nIf this happens, at least make sure that your intermediary encodes are either lossless, or as close to lossless as possible.\nUnfortunately, I have not yet found a reliable way to encode a lossless file that common editing programs can open\n(if you know one, let me know!),\nbut at the very least you can make an x264 encode with `-crf 1`.\n\nTo make an *actually* lossless encode with `x264` you can add `-qp 0` instead of a `-crf` argument,\nbut be aware that not all programs will be able to open such a file.\n\n#### Encoding a Video\nThe simplest way to encode a video is using ffmpeg.\nMore advanced users will encode using x264 or x265 directly, but ffmpeg is fine for beginners.\n\nA basic template command to reencode a video is simply[^encodetemplate]\n```\nffmpeg -i yourinput.mkv -c copy -c:v libx264 -preset slower -crf 20 
youroutput.mkv\n```\nAs explained in the first section, adjust the CRF to control the quality at the expense of file size.\nIf you're encoding anime or animation, you may want to bump up the bframes by adding `-x264-params bframes=8`\n(which will save a bit of file size but take longer to encode).\nOther than that, **do not touch any other settings you do not understand**.\nIn particular, do not use `-tune animation` for anime; that tune is targeted towards extremely flat animation,\nso it will be counterproductive on anime, which usually has a fair amount of grain and texture.\n\n[^encodetemplate]: This is a starting point for the target audience of this article. Experienced encoders targeting transparency will use [very different settings](https://jaded-encoding-thaumaturgy.github.io/JET-guide/master/encoding/x264/).\n\nA good way to think of video encoding is as a three-way tradeoff between file size, quality, and encoding speed.\nYou can decrease the file size, but only at the expense of quality or encoding speed,\nand similarly for the other two factors.\nThe `crf` setting is used to regulate between quality and (decrease in) file size.\nThe `preset` setting controls the encoding speed, and hence the efficiency.\nA faster preset will mean a faster encode, but also a larger and lower-quality one.\n\nBe aware that the visual quality of a given CRF value will depend on the resolution you're encoding at.\nCRF 18 at 1080p behaves differently from CRF 18 at 480p.\nThe best way to pick a CRF for your encode is just to run a few sample encodes and compare the results.\n\n#### Muxing an MKV\nHopefully you can figure this out with the tools linked above (MKVToolNix being the easiest way).\nAll I really want to say here is that if you are muxing in ASS subtitles,\nyou need to add all the fonts used in the subtitles as attachments.\nAegisub has a font collector that can collect all the fonts used in a file.\nIf you don't want to install the fonts, you can use a font 
manager like [FontBase](https://fontba.se) (add the folder with all fonts as a \"watched folder\") to temporarily activate them without installing them.\n\n#### Remuxing to MP4\nThis is more tricky and the main reason why this section exists.\nIn principle, muxing an mp4 file is easy: Just run `ffmpeg -i yourinput.mkv -c copy youroutput.mp4`.\nHowever, chances are that the reason you are remuxing to an mp4 file is so that you can import your video into your favorite video editing program.\nIn that case, remuxing using ffmpeg can cause some problems with the frame rate.\n\nMost videos you'll come across have a constant fractional framerate of 24000/1001 (which is approximately 23.976) frames per second.\nBut this is actually a bit of a lie: A lot of times the frame rates aren't *truly* constant\n(and, in fact, in mkv files they often *cannot* be).\nFor certain technical reasons, frame timestamps often need to be rounded,\nwhich causes ever-so-slight deviations from the constant 24000/1001 frames per second.\nVideo players handle this completely fine, so that you'd never even notice it as a normal (or even experienced) user.\nHowever, some video editing programs can be extremely picky about these frame rates,\nand introduce stuttering when the frame rate is not *truly* constant.\n\nSince mkv files fundamentally cannot have a true constant frame rate of 24000/1001,\nremuxing to mp4 using ffmpeg will also result in a frame rate that is not truly constant.\nYou can see this in MediaInfo in the `Frame rate mode` entry.[^mediainfoframerate]\n\n[^mediainfoframerate]: Though I'm not fully sure if MediaInfo is completely reliable here. The best way to know for sure is to use [MP4 Inspector](https://sourceforge.net/projects/mp4-inspector/) and check the `moov > trak > mdia > minf > stbl > stts` box. If it is truly CFR, there should only be a single entry.\n\nThere exist a couple of ways to fix this:\n1. 
[MkvToMp4](https://www.videohelp.com/software/MkvToMp4) is a GUI application that can remux an mkv to an mp4 file\n   and force a constant frame rate if applicable.\n   While I haven't audited it in detail myself,\n   I know video editors who have used it for a long time and haven't had issues with it.\n   Note that it only supports H.264, not H.265, though.\n2. With the right incantation, you can also force a constant frame rate in ffmpeg.\n   The best one I could come up with needs two invocations, though:\n   ```\n   ffmpeg -i yourinput.mkv -c copy -video_track_timescale 24000 intermediary.mp4\n   ffmpeg -i intermediary.mp4 -c copy -bsf:v \"setts=dts=1001*round(DTS/1001):pts=1001*round(PTS/1001)\" out.mp4\n   ```\n   If your source video is, say, 30000/1001 fps instead of 24000/1001,\n   replace the 24000 in the first call with the appropriate numerator.\n   \n   There also exists a tool called [mp4fpsmod](https://github.com/nu774/mp4fpsmod) that can force mp4 frame rates,\n   but I found the ffmpeg call to be more reliable when the first frame does not start at timestamp 0.\n\nFurthermore, you may also want to add `-movflags +faststart` to your command when muxing an mp4 file.\nThis will make the muxer put all the file's metadata at the start of the file rather than at the end,\nso that even partial downloads of the file can be played back.\nWith the above two commands, this would look as follows:\n```\nffmpeg -i yourinput.mkv -c copy -video_track_timescale 24000 intermediary.mp4\nffmpeg -i intermediary.mp4 -c copy -bsf:v \"setts=dts=1001*round(DTS/1001):pts=1001*round(PTS/1001)\" -movflags +faststart out.mp4\n```\n\n#### Hardsubbing\nUse mpv to hardsub:\n```\nmpv --no-config yourinput.mkv -o youroutput.mkv --audio=no --ovc=libx264 --ovcopts=preset=slower,crf=20,bframes=8\n```\nAdjust the encoding settings accordingly, of course.\n\nThis will hardsub the track marked as the default; add e.g. 
`--sid=0` or `--slang=eng` to select a different track.\nHardsubbing is an extra encoding step, and as explained above, you want to reencode as few times as possible.\nHence, either make sure that hardsubbing happens at the end of the workflow from a lossless source,\nor output a (near) lossless encode when hardsubbing (e.g. by setting the CRF to 1).\n\n## Bonus: Interlacing\nThis is a bonus section meant to prevent some slightly more advanced misconceptions.\nIf you don't know what the term \"interlacing\" means, you can safely skip this section.\n\nIf you *do* know what interlacing means,\nthe main thing I want to get across here is that not all interlacing is the same,\nand in particular that the answer to seeing footage that looks \"interlaced\" is not always to run a deinterlacer.\n\nWhen working with movies and TV shows,\nit is actually much more likely for interlaced-looking footage to really be *telecined*.[^telecine]\nWhat this means exactly is outside the scope of this article,\nbut you can read [fieldbased.media](https://fieldbased.media) or [the Wobbly guide](https://wobbly.encode.moe/gettingstarted/primer.html) for more information.\nThe important takeaway is that telecining can (almost) be losslessly reversed (though it may need manual processing),\nand that running a deinterlacer on telecined footage will throw away half the vertical resolution while still keeping the frame rate stutters.\nWhen you see footage that shows combing, please consult someone more experienced before blindly running a deinterlacer on it.\n\n[^telecine]: This is a fairly established term in the encoding community, but it's actually somewhat incorrect. 
Outside of the encoding community, you'll usually see this being referred to as *3:2 pulldown* instead.\n\n## The Rabbit Hole\nThe above should cover everything you need to know as a beginner.\nIf you like to suffer and are interested in learning more about multimedia and encoding,\nthe [JET Guide](https://jaded-encoding-thaumaturgy.github.io/JET-guide/master/) can be a good place to start.\nIn particular, it contains a big [list of resources](https://jaded-encoding-thaumaturgy.github.io/JET-guide/master/resources/) that link to other good guides. [My blog](https://arch1t3cht.org/) also has some more posts about video stuff.", "url": "https://wpnews.pro/news/what-you-need-to-know-before-touching-a-video-file", "canonical_source": "https://gist.github.com/arch1t3cht/b5b9552633567fa7658deee5aec60453", "published_at": "2025-05-22 20:23:49+00:00", "updated_at": "2026-05-23 02:03:04.118714+00:00", "lang": "en", "topics": [], "entities": ["Handbrake"], "alternates": {"html": "https://wpnews.pro/news/what-you-need-to-know-before-touching-a-video-file", "markdown": "https://wpnews.pro/news/what-you-need-to-know-before-touching-a-video-file.md", "text": "https://wpnews.pro/news/what-you-need-to-know-before-touching-a-video-file.txt", "jsonld": "https://wpnews.pro/news/what-you-need-to-know-before-touching-a-video-file.jsonld"}}