Paper Link: https://arxiv.org/abs/2405.09818

When you upload an image to ChatGPT and ask a question in text, those inputs are converted to numbers before they can