Podcast: How Deepseek leveraged ‘Knowledge Distillation’ to outsmart OpenAI

Website: https://debabratapruseth.com/

There has been a wave of speculation and awe surrounding how Deepseek, a relatively new player, managed to challenge industry giants like OpenAI with its competitive LLMs, achieving comparable performance at a fraction of the time and cost.
One term that frequently comes up in this discussion is “Knowledge Distillation.” In fact, Microsoft, OpenAI, and even the U.S. government are reportedly investigating whether Deepseek leveraged this technique in ways that might raise ethical or legal concerns.

What is Knowledge Distillation?

Knowledge distillation is a machine learning technique where a smaller, more efficient model (the student) learns from a larger, highly trained model (the teacher), rather than directly from raw data. This allows the student model to retain much of the teacher’s intelligence while being significantly more compact and resource-efficient.

How does it work?

1) Train a Big Teacher Model
– A very powerful AI model learns from a huge dataset.
– This model becomes highly accurate but is too big and slow to run cheaply.

2) The Teacher Gives “Soft” Answers
– Instead of just saying “This is a cat” or “This is a dog,” the teacher gives confidence scores. Example: “I’m 85% sure it’s a cat and 5% sure it’s a dog.”
– These soft answers help the student model understand relationships between different classes.

3) Train a Smaller Student Model
– The student model is trained using both real data and the teacher’s soft answers.
– The student learns to mimic the teacher but with fewer resources.

4) Now We Have a Faster, Smaller AI!
– The student model is much smaller and faster.
– Even though it’s not as big as the teacher, it performs almost as well.
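
For readers who like to see the idea in code, the steps above can be sketched in a few lines of Python. This is only a toy illustration, not how Deepseek or OpenAI actually train their models. The function names (softmax, cross_entropy, distillation_loss), the temperature and alpha settings, and the example scores are all assumptions made up for this sketch. It shows how the teacher's "soft" answers are produced and how the student's training loss can mix real labels with those soft answers.

```python
import math

def softmax(scores, temperature=1.0):
    """Turn raw scores into probabilities. A higher temperature gives softer answers."""
    scaled = [s / temperature for s in scores]
    top = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(target_probs, predicted_probs):
    """How far the predicted probabilities are from the target probabilities."""
    return -sum(t * math.log(p)
                for t, p in zip(target_probs, predicted_probs) if t > 0)

def distillation_loss(student_scores, teacher_scores, true_label,
                      temperature=2.0, alpha=0.5):
    """Mix of 'learn from real data' and 'mimic the teacher' (assumed 50/50 here)."""
    # Real data: compare the student's answer with the one correct label.
    hard_target = [1.0 if i == true_label else 0.0
                   for i in range(len(student_scores))]
    hard_loss = cross_entropy(hard_target, softmax(student_scores))

    # Soft answers: compare student and teacher at the same temperature.
    teacher_soft = softmax(teacher_scores, temperature)
    student_soft = softmax(student_scores, temperature)
    soft_loss = cross_entropy(teacher_soft, student_soft)

    # Scaling the soft term by temperature squared is a common convention
    # that keeps the two terms on a comparable scale.
    return alpha * hard_loss + (1 - alpha) * temperature ** 2 * soft_loss

# Hypothetical raw scores for the classes: cat, dog, fox
teacher_scores = [4.0, 1.0, 2.0]
student_scores = [2.0, 0.5, 0.5]

print("Teacher soft answers:", softmax(teacher_scores, temperature=2.0))
print("Student loss:", distillation_loss(student_scores, teacher_scores, true_label=0))
```

During training, the student's parameters would be adjusted to push this loss down. A student whose scores agree with both the correct label and the teacher gets a lower loss than one that disagrees with them.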

Did Deepseek Use OpenAI as Its “Teacher”?

In this case, the suspected teacher is OpenAI, and the student is Deepseek. This raises an important question: Did OpenAI directly assist Deepseek? The answer is no, but OpenAI’s models are accessible through paid APIs, so their outputs can be collected by any customer. What regulators and industry experts are now scrutinizing is whether Deepseek systematically leveraged OpenAI’s APIs for large-scale knowledge distillation.

In the coming days, we may gain clarity on whether Deepseek’s approach violated ethical AI development guidelines, or if it simply found a clever way to compete with the biggest names in AI.

Until then, enjoy both Deepseek and OpenAI.

