PDF embeds the actual font program inside the file — every glyph the document needs ships with it. Word stores a font name and trusts the recipient’s machine to find a matching font at render time.…
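As a toy illustration of the name-based approach, the sketch below models what a renderer might do when it holds only a font name. The function `resolve_font`, its parameters, and the default fallback are hypothetical; real font substitution in Word is considerably more involved and is not reproduced here.

```python
def resolve_font(requested, installed, fallback="Times New Roman"):
    """Return the font a name-based renderer would end up using.

    requested: the font name stored in the document.
    installed: font names available on the recipient's machine.
    fallback:  assumed substitute when nothing matches (illustrative only).
    """
    wanted = requested.casefold()
    for name in installed:
        # Case-insensitive exact match; real matching rules are richer.
        if name.casefold() == wanted:
            return name
    # No match: the document now renders with different glyphs and metrics.
    return fallback
```

With an embedded font program there is nothing to resolve, which is why the same lookup never occurs on the PDF side.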
A PDF holds an image as an object: bytes, metadata, a transformation matrix that places it on the page. Word holds an image as either an inline run inside a paragraph or a floating object anchored …
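To make the placement matrix concrete, here is a minimal sketch. It assumes the usual PDF convention that an image occupies the unit square in its own space and that a six-number matrix `[a b c d e f]` maps a point `(x, y)` to `(a*x + c*y + e, b*x + d*y + f)`. The function name `image_bbox` is hypothetical.

```python
def image_bbox(ctm):
    """Return (x_min, y_min, x_max, y_max) of an image on the page.

    ctm: the six matrix entries (a, b, c, d, e, f) in effect when the
    image is drawn. The image itself is the unit square (0,0)-(1,1).
    """
    a, b, c, d, e, f = ctm
    corners = [(0, 0), (1, 0), (0, 1), (1, 1)]
    # Map each corner of the unit square into page coordinates.
    points = [(a * x + c * y + e, b * x + d * y + f) for x, y in corners]
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))
```

For example, `image_bbox((200, 0, 0, 100, 72, 500))` describes an image 200 units wide and 100 tall with its lower-left corner at (72, 500). The result is a rectangle on a page, and nothing in it says which paragraph the image belongs to.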
Some failures in PDF→Word conversion are bugs that better engineering would fix. The rest are structural: the information Word needs is missing from the source, or the two formats describe incompat…
Run the same PDF through three converters and you will get three broken Word files, each broken in a different way. One assembles tables but scatters the columns. Another keeps columns but folds ev…
A table in a PDF is a stack of horizontal and vertical lines, plus text fragments at coordinates that happen to fall inside the rectangles those lines suggest. The PDF file contains no notion of ro…
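The kind of guesswork this forces on a converter can be sketched as follows. This is a simplified heuristic under stated assumptions, not any particular converter's algorithm: ruling lines are given as plain coordinates, every line spans the whole table, and each text fragment is a single `(x, y, text)` point. The name `infer_cells` and the input shapes are hypothetical.

```python
from bisect import bisect_right

def infer_cells(h_lines, v_lines, fragments):
    """Guess a grid of cell texts from ruling lines and text positions.

    h_lines:   y coordinates of horizontal rules.
    v_lines:   x coordinates of vertical rules.
    fragments: iterable of (x, y, text) tuples.
    """
    ys = sorted(h_lines, reverse=True)  # PDF y grows upward: top rule first
    xs = sorted(v_lines)
    rows, cols = len(ys) - 1, len(xs) - 1
    grid = [["" for _ in range(cols)] for _ in range(rows)]
    for x, y, text in fragments:
        # Column: index of the last vertical rule at or left of x.
        col = bisect_right(xs, x) - 1
        # Row: number of horizontal rules above y, minus one.
        row = sum(1 for line_y in ys if line_y > y) - 1
        if 0 <= row < rows and 0 <= col < cols:
            grid[row][col] = (grid[row][col] + " " + text).strip()
        # Fragments outside every rectangle are silently dropped here.
    return grid
```

Even this toy version has to decide what to do with fragments that fall outside the grid, and it breaks down as soon as a rule is missing or a cell spans two columns.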