Think of ChatGPT as a blurry jpeg of all the text on the Web. It retains much of the information on the Web, in the same way that a jpeg retains much of the information of a higher-resolution image, but, if you’re looking for an exact sequence of bits, you won’t find it; all you will ever get is an approximation. But, because the approximation is presented in the form of grammatical text, which ChatGPT excels at creating, it’s usually acceptable. You’re still looking at a blurry jpeg, but the blurriness occurs in a way that doesn’t make the picture as a whole look less sharp. This analogy makes even more sense when we remember that a common technique used by lossy compression algorithms is interpolation—that is, estimating what’s missing by looking at what’s on either side of the gap. When an image program is displaying a photo and has to reconstruct a pixel that was lost during the compression process, it looks at the nearby pixels and calculates the average. This is what ChatGPT does when it’s prompted to describe, say, losing a sock in the dryer using the style of the Declaration of Independence: it is taking two points in “lexical space” and generating the text that would occupy the location between them.
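The pixel-averaging step described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the image is reduced to a single row of brightness values, a lost pixel is marked with None, and the function name interpolate_missing is hypothetical rather than taken from any real image library. Real image codecs are considerably more elaborate.

```python
def interpolate_missing(pixels):
    """Fill each missing (None) pixel with the average of its nearest known neighbors."""
    filled = list(pixels)
    for i, value in enumerate(pixels):
        if value is not None:
            continue
        # Nearest known value to the left and to the right of the gap.
        left = next((pixels[j] for j in range(i - 1, -1, -1)
                     if pixels[j] is not None), None)
        right = next((pixels[j] for j in range(i + 1, len(pixels))
                      if pixels[j] is not None), None)
        known = [v for v in (left, right) if v is not None]
        # With no known neighbor at all, the gap cannot be estimated.
        filled[i] = sum(known) / len(known) if known else None
    return filled


row = [10, None, 30, 40, None, 80]
print(interpolate_missing(row))  # [10, 20.0, 30, 40, 60.0, 80]
```

The filled-in values are plausible but not recovered: if the lost pixel was originally 25, the program still reports 20.0, which is the sense in which the result is an approximation.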
ChatGPT Is a Blurry JPEG of the Web
Large language models in use today are trained on enormous amounts of text, most of which comes from the web. This summary compares OpenAI's ChatGPT not to a trustworthy responder but to a "blurry JPEG" of the internet. To see why, consider file compression: text is encoded, or converted, into a smaller version and later decoded by performing the opposite operation. If the original and decoded files are the same, the compression is "lossless"; if some information is discarded along the way and only an approximation can be recovered, it is "lossy."
Language models like OpenAI's ChatGPT draw on the statistical regularities in the massive amount of text they are trained on to produce grammatical approximations of that text, and those responses are usually acceptable. However, because the process that generates the answers is lossy, the result resembles a "blurry JPEG of all text on the web." Detecting the resulting inaccuracies frequently requires comparing AI-fabricated responses against the original materials. Such errors are to be expected if compression has discarded 99% of the original text.
Lossy compression algorithms sometimes employ "interpolation," whereby the computer estimates the missing information by looking at what lies on either side of the gap (Sinclair, 2023). It is comparable to an image program reconstructing a lost pixel in a compressed image from the values of the neighboring pixels. People find this type of interpolation amusing because ChatGPT is so adept at it: they have discovered a "blur" tool for paragraphs rather than photographs and are having a blast using it.
Compression is also tied to understanding. For instance, if a text file contains a million arithmetic examples, the best compression ratio would most likely be achieved by deriving the principles of arithmetic and then writing the code for a calculator program. Large portions of text can be compressed using the same reasoning.
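The following Python sketch illustrates both ideas on a small scale. It first checks that a general-purpose lossless compressor (the standard-library zlib module) returns the exact original after decoding, and then compares the compressed size with the length of a one-line rule that regenerates the same text. The corpus of addition facts and the variable names are assumptions made for illustration; they stand in for the "million arithmetic examples" of the argument and are far smaller.

```python
import zlib

# Hypothetical corpus: 10,000 addition facts written out as plain text.
facts = "\n".join(f"{a} + {b} = {a + b}" for a in range(100) for b in range(100))
raw = facts.encode("utf-8")

# Lossless compression: decoding gives back exactly the original bytes.
packed = zlib.compress(raw, 9)
assert zlib.decompress(packed) == raw

# "Knowing the rule": a short expression that regenerates the whole file.
rule = '"\\n".join(f"{a} + {b} = {a + b}" for a in range(100) for b in range(100))'
assert eval(rule) == facts

print("original bytes:  ", len(raw))
print("zlib compressed: ", len(packed))
print("rule as text:    ", len(rule))
```

On this corpus the rule is much shorter than the zlib output, which in turn is much shorter than the original. The sketch does not show how a program might discover such a rule on its own; it only shows why a compressor that knew the rule would win.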
For instance, when a language model processes a wealth of text about "supply and demand," it in effect decides which details to discard as it goes. The GPT-3 large language model can correctly add or subtract small integers, but its accuracy declines steadily as the numbers get larger. Its inability to "carry the one" when doing arithmetic is one of its flaws. Some argue that the statistical regularities found in text correspond to genuine knowledge of the real world. A simpler explanation is that the very fact that ChatGPT is not a lossless system makes it seem more human. It would function like a glorified search engine if it cont...
