The Digital Battlefield

Big Tech Bias, Algorithmic Manipulation, and Data Sovereignty

This lesson examines how Big Tech platforms, AI systems, and algorithmic manipulation constitute a new form of civilizational warfare. From Google Gemini's biased treatment of Hinduism to TikTok's radicalization pipelines, it maps the digital mechanisms that mediate India's self-understanding through foreign infrastructure.

See It Today: When the Machine Learns Your Bias

In February 2024, users began testing Google's new AI model, Gemini, with prompts about world religions. The results were striking in their asymmetry.

Asked to generate an image of Jesus, Gemini politely declined, citing respect for religious sensitivities. Asked for an image of Prophet Muhammad, it declined for the same reason. Asked for an image of Lord Shiva or Goddess Kali, it complied, but the outputs were bizarre: Hindu deities sometimes appeared with incorrect iconography, the wrong number of arms, or Western fantasy aesthetics that stripped the sacred imagery of its meaning.

The text responses revealed a deeper pattern. Queries about Christianity or Islam produced respectful, contextual answers acknowledging theological complexity. Queries about Hinduism consistently foregrounded caste, patriarchy, and social hierarchy. Not as one dimension among many, but as the defining frame. The civilization that produced Panini's grammar, Aryabhata's mathematics, and Nagarjuna's logic was reduced to a caste problem with some festivals attached.

This wasn't a glitch. It was a mirror. Gemini's training data drew heavily from English-language academic sources, and as earlier lessons in this course have documented, English-language Indology carries specific biases baked in over two centuries. The AI didn't invent the bias. It scaled it. What took a university department decades to normalize, an AI model could inject into millions of conversations per day.

OpenAI's ChatGPT displayed similar patterns. Ask it to explain "the caste system" and you'd get a confident, detailed answer. Ask it to explain "jati-varna complexity" or "social mobility in pre-colonial India" and the responses became vague, hedging, or defaulted back to the oppression narrative.

The digital battlefield isn't coming. It's already here. And the weapons aren't tanks or missiles. They're algorithms, training datasets, and the invisible hand that decides what 1.4 billion people see, think, and believe about themselves.

The Mechanism: Algorithms, Data, and Digital Colonialism

To understand digital colonialism, you need to understand three layers: the platform layer, the algorithm layer, and the data layer. Each operates as a mechanism of control, and together they form an architecture of influence more powerful than any colonial administration.

The Platform Layer: Who Owns the Town Square?

India has over 800 million internet users. The platforms they spend their time on are almost entirely American or, until recently, Chinese. Google controls search and, through YouTube, video. Meta controls social networking. X (formerly Twitter) controls real-time discourse. Google and Apple control app distribution, and Amazon is a major force in e-commerce and cloud hosting.

This means the rules governing what 800 million Indians can say, see, share, and sell are written in boardrooms in San Francisco and Menlo Park. Content moderation policies are designed for American cultural contexts and then applied globally. What counts as "hate speech," what gets labeled "misinformation," what gets amplified and what gets suppressed: these decisions are made by people with no understanding of Indian civilizational context and no accountability to Indian citizens.

The structural problem is jurisdiction without sovereignty. These platforms operate in India, profit from Indian users, and shape Indian public discourse. But they answer to American regulators, American shareholders, and American cultural norms.

The Algorithm Layer: The Invisible Editor

If the previous lesson exposed the visible editors in newsrooms and bureaucracies, this lesson exposes the invisible one: the algorithm.

Social media algorithms optimize for engagement. Engagement is driven by emotional arousal. The content that generates the most clicks, shares, and comments is content that makes you angry, afraid, or outraged. This creates a structural bias toward division.

For India's civilizational discourse, this means content that frames Hinduism as oppressive gets amplified because it generates outrage from both sides. Content that explores the nuance of dharmic philosophy gets buried because nuance doesn't generate clicks. The algorithm doesn't care about truth or civilizational continuity. It cares about engagement metrics. And division is more engaging than unity.
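
The ranking logic described above can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the Post fields, the WEIGHTS values, and the sample numbers are hypothetical, and real platforms do not publish their ranking formulas. It shows only the general idea that a feed sorted purely by an engagement score never looks at accuracy or nuance.

```python
from dataclasses import dataclass


@dataclass
class Post:
    title: str
    clicks: int
    shares: int
    comments: int


# Hypothetical weights, chosen only for illustration.
WEIGHTS = {"clicks": 1.0, "shares": 3.0, "comments": 2.0}


def engagement_score(post: Post) -> float:
    """Weighted sum of interactions; nothing here measures truth or nuance."""
    return (
        WEIGHTS["clicks"] * post.clicks
        + WEIGHTS["shares"] * post.shares
        + WEIGHTS["comments"] * post.comments
    )


def rank_feed(posts: list[Post]) -> list[Post]:
    """Order posts from highest to lowest engagement score."""
    return sorted(posts, key=engagement_score, reverse=True)


# Made-up sample data: a calm explainer and a high-arousal thread.
posts = [
    Post("Nuanced explainer on dharmic philosophy", clicks=120, shares=5, comments=8),
    Post("Outrage-bait thread", clicks=300, shares=90, comments=210),
]

feed = rank_feed(posts)
for post in feed:
    print(f"{engagement_score(post):7.1f}  {post.title}")
```

With these made-up numbers the high-arousal thread ranks first. The design point is that the scoring function has no term for quality at all, so whatever provokes the most interaction wins by construction.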

The algorithmic suppression of Indic content is documented. Creators producing content about Sanskrit, Vedic mathematics, temple architecture, or dharmic philosophy consistently report lower reach, demonetization, and reduced recommendations compared to content that critiques Hindu practices. YouTube's recommendation engine will lead a viewer from a video about Diwali to a video about "caste violence" in three clicks, but rarely lead from a video about caste to one about Bhakti movement reformers.

The Data Layer: Who Owns India's Mind?

Every search query, every like, every share, every private message, every location check-in: this data flows to servers controlled by foreign corporations. India generates one of the world's largest data streams. The insights extracted from this data, about consumer behavior, political sentiment, cultural trends, religious practice, shape how these platforms treat Indian users.

This is the data sovereignty problem. India's digital population is the product, not the customer. Indian users generate the data. American corporations extract the value. The behavioral models built from Indian data are used to sell products to Indians, influence Indian elections, and shape Indian cultural attitudes. All without Indian oversight.

The TikTok episode crystallized this. By June 2020, TikTok had over 200 million users in India. ByteDance, its Chinese parent company, had access to location data, contact lists, browsing patterns, and behavioral profiles of 200 million Indians. India's intelligence agencies flagged the national security implications: a foreign government, through a single app, had more granular data on Indian citizens than India's own census.

But TikTok's impact went beyond data harvesting. Its algorithm, optimized for addictive short-form content, was creating radicalization pipelines. Communally provocative content spread faster than educational content. Conspiracy theories outperformed factual analysis. The algorithm learned that India's fault lines were engagement goldmines and mined them relentlessly.

India's decision to ban TikTok and 58 other Chinese apps in June 2020 was one of the first major acts of digital sovereignty by any democracy. The United States, United Kingdom, and European Union followed years later with their own TikTok restrictions, starting with bans on government devices. India's move was dismissed as "authoritarian" by the same Western commentators who later advocated similar measures for their own countries.

The AI Layer: Bias at Scale

The newest front in the digital battlefield is artificial intelligence. Large Language Models (LLMs) like GPT, Gemini, and Claude are trained on datasets that are overwhelmingly English-language and Western-sourced. The Indology embedded in these models reflects the academic establishment documented in Lesson 04_01: Wendy Doniger's interpretive frameworks, Sheldon Pollock's political readings of Sanskrit, and two centuries of colonial epistemology.

When these models are asked about Indian civilization, they don't consult Abhinavagupta or Shankaracharya. They consult the Western academic corpus that interpreted (and often misinterpreted) Abhinavagupta and Shankaracharya. The result is AI systems that reproduce colonial knowledge frameworks at unprecedented scale.

AI-generated deepfakes add another dimension. During Hindu festivals, fabricated images and videos circulate on social media: doctored footage of "animal sacrifice" during Navratri, fake videos of "forced conversion" during Holi celebrations, manipulated images of temple demolitions. These deepfakes are designed to provoke outrage, both within Hindu communities and against them. The speed of AI generation outpaces any fact-checking infrastructure.

The Pattern: The Printing Press Was the First Algorithm

The digital battlefield has a precise colonial precedent: the printing press.

The first printing press in India arrived in 1556 with Jesuit missionaries in Goa. For over two centuries, printing in India was predominantly a missionary enterprise. The technology that could have preserved and disseminated Indian knowledge systems was instead used to produce Christian theological texts, anti-Hindu polemical literature, and colonial administrative documents.

This wasn't accidental. Control of the printing press meant control of what knowledge was produced, reproduced, and distributed. The missionaries understood that technology is never neutral. The printing press was the algorithm of its era, and those who controlled it controlled the narrative.

When Indians began establishing their own presses in the late 18th and early 19th centuries, the colonial response was swift. Raja Ram Mohan Roy's Mirat-ul-Akhbar, one of India's first independent newspapers, operated under constant government scrutiny. The Vernacular Press Act of 1878 (often called the Lytton Press Act) gave the colonial government power to shut down any Indian-language publication deemed "seditious." English-language publications faced no such restrictions.

The pattern is identical to today's digital landscape. The technology is controlled by external powers. Indian voices using that technology face differential treatment. And the justification is always framed in neutral language: "sedition" then, "community guidelines" and "misinformation" now.

James Augustus Hicky, who founded Hicky's Bengal Gazette in 1780, was jailed and had his press seized after he criticized the East India Company. Today, accounts get suspended and content gets demonetized. The mechanism has been digitized, but the power dynamic is unchanged: those who control the information infrastructure control the civilization's self-understanding.

The colonial government also understood data sovereignty avant la lettre. The Great Trigonometrical Survey of India (1802-1871) mapped every inch of the subcontinent. This data wasn't gathered for Indian benefit. It was gathered for administrative control, resource extraction, and military advantage. The modern equivalent is the behavioral data harvested by Big Tech: gathered from Indians, processed abroad, used for purposes Indians don't control.

The lesson from history is clear. Every new information technology becomes a battleground for civilizational control. The civilization that controls the technology shapes the narrative. The civilization that merely uses it gets shaped.

Dharmic Wisdom: Vidya as Sovereignty

The Arthashastra recognizes information as the foundation of sovereignty. Kautilya devoted entire sections to intelligence networks, counter-intelligence, and information management. Not because he was paranoid, but because he understood a fundamental truth: a kingdom that cannot control its information environment cannot control anything else.

The concept of Vidya (knowledge) in the dharmic tradition goes beyond "education" or the accumulation of information. Vidya is the capacity to see clearly, to distinguish truth from illusion. Its opposite, Avidya, is not ignorance in the simple sense. It is the condition of being made to see wrongly. When an AI system trains an entire generation to view their civilization through a distorted lens, that is Avidya engineered at industrial scale.

The Mundaka Upanishad distinguishes between Para Vidya (higher knowledge, knowledge of the Self and the Ultimate) and Apara Vidya (lower knowledge, knowledge of the material world). Both are necessary. But when a civilization loses control over its Para Vidya, when its deepest self-understanding is mediated by foreign algorithms and training data, it loses something no amount of material progress can restore.

Chanakya's principle of Sva-Tantra (self-governance, literally "one's own system") applies directly to the digital realm. A nation that runs on someone else's platforms, stores its data on someone else's servers, and allows someone else's algorithms to shape its public discourse is not truly Sva-Tantra. It has achieved political independence while surrendering informational sovereignty.

The Mahabharata offers a powerful metaphor. In the game of dice, the Pandavas lost everything not through military defeat but through a rigged game whose rules they didn't control. The digital platforms of today are the dice game. The rules are written by others. The house always wins. And India keeps playing.

The Defense: Digital Sva-Rajya

The digital battlefield demands a comprehensive defense strategy. Not censorship or isolation, but the creation of sovereign digital infrastructure and the cultivation of digital civilizational literacy.

Build Sovereign Infrastructure. India has already demonstrated it can do this. UPI (Unified Payments Interface) proved that India can build digital infrastructure that rivals and surpasses Silicon Valley products. Over 10 billion transactions per month flow through a system designed in India, governed by Indian rules, and serving Indian interests. The Aadhaar identity platform, whatever its privacy controversies, demonstrated that India can build population-scale digital systems. The Co-WIN vaccination platform managed the world's largest vaccination drive digitally.

These successes need to be extended into the information domain. India needs indigenous search engines that understand Indic languages and civilizational context. It needs social media platforms where content moderation reflects Indian, not American, cultural norms. It needs AI models trained on Indian knowledge systems alongside Western ones: models that can discuss Panini with the same fluency as Chomsky, that understand Dharmashastra as well as they understand constitutional law.

The Indian government's push for data localization, requiring that Indian data be stored on Indian servers, is a necessary first step. Digital India's cloud infrastructure initiatives and the development of IndiaAI (the national AI mission) show strategic awareness. But infrastructure alone is not enough without the knowledge systems to populate it.

Cultivate Digital Civilizational Literacy. Every Indian internet user needs to understand three things: how algorithms shape what they see, who controls the platforms they use, and what happens to the data they generate. This isn't technical literacy alone. It's civilizational literacy applied to the digital domain.

When a young Indian sees Gemini describe Hinduism primarily through the caste lens, they should recognize the colonial academic lineage behind that framing. When they notice that their YouTube recommendations lead from devotional content to "debunking" content, they should understand the engagement-optimization logic driving that pipeline. When they see a deepfake video of a festival incident, they should have the tools to verify before they react.

This literacy must be built into education, from school curricula to university programs to public awareness campaigns. Digital civilizational literacy is not optional. It is a survival skill.

Support Indic AI Development. The AI models that will shape the next generation's understanding of every civilization on earth are being built right now. If Indian knowledge systems are not represented in these models' training data, they will be represented through Western academic proxies, with all the biases that entails.

Indian technologists, scholars, and institutions need to collaborate on building training datasets that include primary Indian sources: the Vedas, Upanishads, Arthashastra, Thirukkural, Sangam literature, Buddhist Pali texts, Jain Agamas, and the vast corpus of Indian scientific, mathematical, and philosophical writing. Not as curiosities, but as legitimate knowledge systems with analytical frameworks as rigorous as any Western equivalent.

Projects like AI4Bharat (at IIT Madras) working on Indian language AI models, the National Digital Library of India archiving Indian texts, and Sanskriti AI initiatives developing culturally aware language models represent the right direction. These efforts need both government support and community investment.

Exercise Consumer Sovereignty. Every click is a vote. Every download is a choice. Indians who understand the stakes can make conscious decisions about which platforms they use, which apps they trust with their data, and which content they engage with. Supporting Indian-built alternatives, demanding Indic language support, and rewarding platforms that treat Indian civilization with the same respect they show others: these are acts of digital self-defense that require no legislation or institutional power. Just awareness and will.

The Arthashastra teaches that the wise king doesn't just build walls. He builds his own roads, his own markets, his own intelligence networks. In the digital age, the wise civilization doesn't just regulate foreign platforms. It builds its own.

Case Studies

Google Gemini's Civilizational Blind Spot

In February 2024, users testing Google's Gemini AI discovered systematic asymmetry in how it treated world religions. When asked about Christianity or Islam, Gemini produced respectful, contextually rich answers acknowledging theological complexity. When asked about Hinduism, responses consistently foregrounded caste hierarchy, patriarchal structures, and social oppression. Image generation requests for Hindu deities were handled with incorrect iconography (wrong number of arms, Western fantasy aesthetics), while requests for Abrahamic religious figures were declined on grounds of 'religious sensitivity.' The AI's training data drew heavily from English-language academic sources shaped by two centuries of colonial Indology, producing bias at unprecedented scale.

Patanjali's Yoga Sutras identify Viparyaya (false knowledge) as more dangerous than simple ignorance. Ignorance can be corrected by introducing knowledge. Viparyaya resists correction because the subject believes they already know. Gemini's outputs are Viparyaya industrialized: confident, fluent, internally consistent descriptions of Indian civilization that do not correspond to reality but feel authoritative. The Nyaya school's insistence on validating knowledge through multiple Pramanas (means of knowledge) is precisely what AI systems lack. They rely on a single Pramana (the training corpus) and that corpus carries embedded biases.

Google issued corrections and acknowledged bias in Gemini's outputs. But the structural problem remains: AI training data is overwhelmingly sourced from English-language Western academic institutions. Without deliberate inclusion of Indian primary sources and indigenous scholarly perspectives, every major AI model will reproduce colonial knowledge frameworks at scale. The corrections addressed symptoms, not the underlying data architecture.

Technology amplifies the worldview of those who build it. When AI systems are trained on datasets that reflect two centuries of colonial scholarship, they don't produce neutral knowledge. They produce colonial knowledge at machine speed. The solution is not better filters but better training data.

As AI becomes the primary knowledge interface for billions of users, the biases embedded in training data will shape how entire generations understand every civilization on earth. The Gemini incident is not an isolated bug but a preview of epistemic colonialism at AI scale.

Large Language Models are trained on datasets where English-language content constitutes over 90% of the corpus, while Indian-language content represents less than 1%, despite India having over 1.4 billion people and 22 officially recognized languages.
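
A share like the one quoted above can be approximated for any text collection by counting letters per writing system. The sketch below is a rough proxy under stated assumptions, not how any model vendor measures its corpus: the helper names script_of and script_shares are hypothetical, and the script label is simply the first word of each character's Unicode name. It counts scripts, not languages, so Hindi typed in Latin transliteration is counted as Latin.

```python
import unicodedata
from collections import Counter


def script_of(ch: str) -> str:
    """Coarse script label: first word of the Unicode name, e.g. LATIN or DEVANAGARI."""
    if not ch.isalpha():
        return "OTHER"  # digits, spaces, punctuation, and combining marks are skipped
    name = unicodedata.name(ch, "")
    return name.split(" ")[0] if name else "OTHER"


def script_shares(texts: list[str]) -> dict[str, float]:
    """Fraction of alphabetic characters belonging to each script across all texts."""
    counts: Counter[str] = Counter()
    for text in texts:
        for ch in text:
            label = script_of(ch)
            if label != "OTHER":
                counts[label] += 1
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()} if total else {}


# Tiny made-up sample; a real audit would stream documents from the dataset.
sample = ["Hello world", "नमस्ते", "வணக்கம்"]
for label, share in sorted(script_shares(sample).items()):
    print(f"{label:12s} {share:.2%}")
```

Run over a representative sample of a training set, a count like this gives a first-order view of how much Indic-script text the set actually contains before any finer language identification is attempted.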

The Colonial Printing Press: The First Algorithm

The first printing press arrived in India in 1556 with Jesuit missionaries in Goa. For over two centuries, printing technology in India was predominantly controlled by missionary and colonial enterprises. The technology that could have preserved and disseminated India's vast knowledge systems was instead used to produce Christian theological texts, anti-Hindu polemical literature, and colonial administrative documents. When Indians began establishing their own presses in the late 18th century, the colonial response was swift. Raja Ram Mohan Roy's publications faced constant scrutiny. The Vernacular Press Act of 1878 empowered the colonial government to shut down any Indian-language publication deemed seditious, while English-language publications faced no equivalent restriction.

Kautilya recognized that information infrastructure is never neutral. In the Arthashastra, he treated control of communication networks as a core function of sovereignty, not a peripheral concern. The colonial printing press monopoly was a textbook case of what Kautilya warned about: when you allow external powers to control your knowledge infrastructure, they will use it to reshape your civilization's self-understanding. The differential treatment of vernacular vs. English press mirrors today's algorithmic suppression: the language of the colonizer gets amplified while indigenous voices face additional barriers.

Indian-owned presses eventually broke the colonial monopoly, producing the newspapers and literature that fueled the independence movement. But the pattern left a lasting legacy: English-language media retained a prestige and reach that vernacular media never fully matched. The structural advantage established in the colonial printing era persists in the digital age, where English-language content dominates AI training data and search algorithms.

Every new information technology becomes a battleground for civilizational control. The printing press, the telegraph, radio, television, and now digital platforms follow the same pattern: those who control the infrastructure control the narrative. The civilization that merely uses technology built by others gets defined by others.

Today's content moderation policies ('community guidelines') function like the Vernacular Press Act: neutral language masking differential treatment. Indian content creators face shadowbanning and demonetization for discussing Hindu traditions, while content critiquing those same traditions faces no equivalent barriers.

The Vernacular Press Act of 1878 led to the suppression of over 40 Indian-language publications in its first two years, while not a single English-language publication was affected by the law.

India's TikTok Ban: Digital Sovereignty in Action

By June 2020, TikTok had amassed over 200 million users in India, making it ByteDance's largest market outside China. Indian intelligence agencies flagged that the app was harvesting location data, contact lists, browsing patterns, and behavioral profiles of 200 million citizens, with all data flowing to servers accessible to the Chinese government under China's National Intelligence Law. Beyond data harvesting, TikTok's algorithm, optimized for addictive short-form content, was creating radicalization pipelines. Communally provocative content spread faster than educational content. The algorithm learned that India's civilizational fault lines were engagement goldmines and exploited them. On June 29, 2020, India banned TikTok along with 58 other Chinese apps, citing national security and data sovereignty concerns.

The Arthashastra's concept of Kuta Yuddha (covert warfare) applies directly. Kautilya understood that the most effective attacks come disguised as something harmless or even beneficial. A free entertainment app that 200 million people voluntarily install is the perfect Kuta Yuddha weapon: the target population downloads the surveillance tool themselves, provides their own data willingly, and becomes addicted to the very mechanism that is profiling them. India's ban was an exercise of what Kautilya called Danda Niti (the science of governance through decisive action) applied to the digital domain.

India's TikTok ban was initially criticized as 'authoritarian' by Western commentators. Within three years, the United States, United Kingdom, European Union, and multiple other nations enacted their own TikTok restrictions, citing similar national security concerns. Domestic alternatives such as Josh and Moj, along with Meta's Instagram Reels, captured much of the market. The ban demonstrated that digital sovereignty is achievable without economic collapse, and that first-mover action on tech sovereignty creates global precedent.

Digital sovereignty requires the political will to act before the threat becomes irreversible. India proved that a democracy can make hard choices about foreign technology platforms without descending into authoritarianism. The nations that criticized India's decision eventually adopted the same position.

The TikTok precedent established that data sovereignty is a legitimate national security concern. But the same logic applies to American platforms: Google, Meta, and Amazon harvest Indian data at comparable scale. The question is whether India will extend the sovereignty principle beyond Chinese apps to all foreign platforms that treat Indian data as a resource to be extracted.

India's TikTok ban preceded the US TikTok ban by nearly four years. In that period, at least 15 other nations implemented their own restrictions on the app, many citing India's action as precedent.

All lessons in Institutional Capture & Internal Decay · Unbreaking India course