Foundations of Advanced Machine Intelligence — A Complete 10-Class Curriculum
This iGentixAI curriculum explores the frontier of Artificial Intelligence — transitioning from basic pattern recognition into the complex realms of abstract reasoning, combinatorial creativity, and the architectural quest for Artificial General Intelligence. Students will examine recursive memory, functionalist emotion models, and relational reasoning as frameworks to simulate human-like cognition. By moving beyond monolithic structures toward modular, interconnected "societies of mind," this course outlines the theoretical and practical foundations necessary for the next generation of autonomous discovery.
A foundational orientation to the iGentixAI curriculum. We establish the current state of AI, its extraordinary capabilities, and — crucially — where it still falls fundamentally short. This class maps the intelligence spectrum from rule-based systems to today's LLMs, and frames the questions that will drive every subsequent lesson.
Modern AI systems such as Large Language Models are trained on hundreds of billions to trillions of text tokens and can produce code and prose at remarkable quality, while related generative models do the same for images and music. Yet they still operate within learned statistical patterns, not genuine understanding. They predict the next token, not the next idea. This gap is the heart of our journey.
From rule-based expert systems of the 1970s to today's transformer-based neural networks, intelligence in machines has been a long, discontinuous progression. We map this timeline — ELIZA (1966), Deep Blue (1997), AlexNet (2012), GPT-4 (2023) — and identify where the next fundamental leap, AGI, may emerge and what it would require.
Introduction to the curriculum's core pillars: (1) Creativity, (2) World Models, (3) Recursive Memory, (4) Infrastructure, (5) AGI Theory, (6) Functionalist Emotion, (7) Emergence, (8) Modular Architecture, and (9) Relational Reasoning — each a necessary building block toward machine general intelligence.
ChatGPT drafting a business plan demonstrates narrow generative AI at impressive scale. AlphaFold predicting 200 million protein structures demonstrates AI solving specialized problems that baffled scientists for 50 years. Neither "understands" — they pattern-match at extraordinary scale. This course asks: what would genuine understanding actually require, and are we building toward it?
How do machines create? This class analyzes the capacity of LLMs to synthesize massive datasets and remix existing ideas into novel, unifying principles — from combinatorial mixing to the frontier of truly transformational creativity that changes the rules of a domain.
Like a molecular chef combining unexpected ingredients, AI systems explore the combinatorial space of all known concepts. In drug discovery, AI proposes novel molecular structures by recombining known compound families — yielding candidates no human chemist would intuitively try. The key insight: AI's creativity is non-intuitive precisely because it lacks human cognitive biases and cultural constraints.
The harder frontier: can AI change the rules of a domain rather than remix within them? This would mean proposing a new mathematical framework, inventing a genuinely new musical scale system, or generating a paradigm-shifting scientific theory — not just applying known ones. No AI system has definitively achieved this yet, but it is the active research horizon.
Systems trained on scientific literature can propose "what if" questions across chemistry, biology, and physics — accelerating ideation cycles from years to hours. This transforms the scientist's role from explorer to evaluator, curating AI-generated hypotheses rather than generating all ideas manually.
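Insilico Medicine reported in 2019 that its generative AI system designed novel inhibitors of DDR1, a kinase target implicated in fibrosis, in 46 days, a stage that typically takes several years in traditional research. The AI explored combinatorial molecular design space, then ranked candidates by predicted properties such as binding affinity. A separate AI-designed Insilico candidate for idiopathic pulmonary fibrosis has since advanced into human clinical trials. Creativity at machine speed, now being tested in the clinic.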
iGentixAI systems build internal "world models" — compressed simulations of their environment — enabling them to test and innovate within virtual spaces before acting in reality. We examine how this capability supercharges combinatorial creativity and accelerates learning.
A world model is an AI's internal compressed representation of how its environment works. Rather than reacting to each input fresh, the system predicts what comes next, simulates consequences, and plans ahead — much like a chess player visualizing future board states before moving a piece. The quality of the world model determines the quality of the agent's decisions.
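The predict, simulate, and plan loop described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the one-dimensional corridor, `world_model`, and `plan` are hypothetical names invented for this example, and a real system would learn its model from data instead of having it hand-written.

```python
from itertools import product

# Hypothetical toy environment: an agent on a 1-D corridor of cells 0..4.
ACTIONS = {"left": -1, "stay": 0, "right": 1}

def world_model(state: int, action: str) -> int:
    """Predict the next state without touching the real environment."""
    return min(4, max(0, state + ACTIONS[action]))

def plan(state: int, goal: int, horizon: int = 3) -> tuple:
    """Simulate every action sequence of length `horizon`; keep the one ending closest to the goal."""
    best_seq, best_dist = (), abs(goal - state)
    for seq in product(ACTIONS, repeat=horizon):
        s = state
        for a in seq:
            s = world_model(s, a)      # imagined step, not a real one
        dist = abs(goal - s)
        if dist < best_dist:           # strictly better than anything seen so far
            best_seq, best_dist = seq, dist
    return best_seq

best_plan = plan(state=0, goal=3)
```

Exhaustive search only works here because the toy action space is tiny; the point is that every candidate is evaluated inside the model before any action is taken.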
By running millions of virtual experiments in simulated environments, AI can test combinations that would be dangerous, expensive, or physically impossible. Autonomous vehicle companies run billions of simulated driving hours to teach edge-case handling — experiences no fleet of real test cars could accumulate in decades.
Human creativity is constrained by cognitive bias, cultural context, and lived experience. World-model-enabled AI explores non-intuitive regions of concept space — finding solutions humans would never naturally consider because they lie outside familiar conceptual territory.
DeepMind's AlphaZero learned chess, shogi, and Go purely by playing against itself, starting from only the rules of each game, and developed strategies that surprised expert players. In chess it searched roughly 80,000 positions per second, orders of magnitude fewer than conventional engines, relying on its learned evaluation to focus on promising lines. Some of its moves were initially judged "mistakes" by human experts before proving strategically sound.
Memory is the scaffold of intelligence. This class explores how recursive memory mechanisms allow AI systems to iteratively re-access, refine, and build upon their own internal states — enabling coherent long-horizon planning, multi-step reasoning, and self-correction.
Unlike a simple feedforward pass through a neural network, recursive memory allows the system to "think again" — feeding its own output back as new input. This enables dynamic, evolving representations that grow richer with each cycle, like a writer revising a draft multiple times. Each pass can catch errors, add nuance, or restructure the argument that earlier passes produced.
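As a loose analogy for "feeding its own output back as new input", the sketch below applies a revision step repeatedly until the result stops changing. The helper `refine` and the Newton update used as the toy revision step are assumptions chosen for illustration; they are not how any particular language model implements recurrence.

```python
from typing import Callable

def refine(step: Callable[[float], float], state: float,
           max_passes: int = 50, tol: float = 1e-12):
    """Feed each output back in as the next input until it stops changing."""
    for n in range(1, max_passes + 1):
        new_state = step(state)
        if abs(new_state - state) < tol:
            return new_state, n        # converged after n passes
        state = new_state
    return state, max_passes

# Toy "revision" step: one Newton update toward sqrt(2).
def newton_sqrt2(x: float) -> float:
    return 0.5 * (x + 2.0 / x)

estimate, passes = refine(newton_sqrt2, 1.0)
```

Each pass starts from the previous pass's answer, so early errors shrink with every cycle instead of being fixed in place by a single forward computation.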
Recursive memory allows systems to maintain a coherent model of the world across many steps — essential for tasks like multi-step mathematical reasoning, long-form writing, or robotic planning. Without it, each reasoning step is disconnected from the last — a fundamental limitation of early language models.
When an AI's internal model predicts an outcome that conflicts with incoming data or logical constraints, recursive memory flags the discrepancy. The system backtracks, revises its hypothesis, and refines its understanding — a fundamental property of adaptive, rather than reactive, intelligence.
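The flag, backtrack, and revise cycle described above has a classical counterpart in backtracking search. The sketch below is a hedged illustration only: `solve` and the three-region colouring problem are hypothetical, and the "discrepancy" is a simple constraint check, not a learned prediction error.

```python
def solve(variables, domains, conflicts, assignment=None):
    """Depth-first search that backtracks whenever a constraint is violated."""
    assignment = {} if assignment is None else assignment
    if len(assignment) == len(variables):
        return assignment
    var = variables[len(assignment)]
    for value in domains:
        # Discrepancy check: does this hypothesis clash with what is already fixed?
        if all(assignment.get(other) != value for other in conflicts[var]):
            assignment[var] = value
            result = solve(variables, domains, conflicts, assignment)
            if result is not None:
                return result
            del assignment[var]        # backtrack and revise the hypothesis
    return None

# Hypothetical toy problem: colour three mutually adjacent regions.
regions = ["A", "B", "C"]
adjacent = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B"]}
coloring = solve(regions, ["red", "green", "blue"], adjacent)
```

With only two colours available the same call returns `None`: every hypothesis is tried, found inconsistent, and withdrawn.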
OpenAI's o1 and o3 model series use extended "chain-of-thought" reasoning, a form of recursive processing in which the model drafts internal reasoning steps, critiques them, and refines the answer before outputting. On the AIME, a qualifying exam for the mathematics olympiad, OpenAI reported accuracy rising from roughly 13% for GPT-4o to 83% for o1, and above 90% for o3. OpenAI attributes much of this gain to training the model to reason and self-correct at length before answering.
Intelligence at scale requires extraordinary physical infrastructure. We examine the specialized silicon, software ecosystems, and massive capital flows that make advanced AI possible — and how hardware constraints define the ceiling of what AI can achieve.
Training GPT-4 reportedly required ~25,000 NVIDIA A100 GPUs running for months. General-purpose CPUs are fundamentally insufficient for the parallel matrix mathematics at the heart of neural networks. GPUs, TPUs (Google's custom chips), Groq's LPUs, and next-generation custom ASICs represent a new era of purpose-built compute for AI.
NVIDIA's CUDA platform became the de facto operating layer for AI research — a decade-long moat. Its ecosystem of optimized libraries lets researchers write high-level code that efficiently exploits GPU parallelism. Open-source alternatives like ROCm (AMD) and XLA (Google) are now mounting serious challenges to CUDA's dominance.
The "scaling hypothesis" proposes that intelligence predictably emerges from simply training larger models on more data with more compute. Validated repeatedly across five years of LLM development, it has driven hundreds of billions in infrastructure investment — datacenters, power grids, cooling systems, and dedicated fiber networks.
Microsoft's reported $10B investment in OpenAI and the subsequent "Stargate" initiative, announced with an initial $100B commitment to AI infrastructure, illustrate how AI capability is now inseparable from physical capital. In Pennsylvania, Microsoft has agreed to buy the entire electrical output of the Three Mile Island Unit 1 nuclear reactor, which is being restarted to power its data centers: intelligence powered by atomic energy.
What would it actually take to build a genuinely general intelligence? This class explores emergent intelligence, "as relations" — novel conceptual frameworks imposed on raw data — and the theoretical architectures that point toward AGI as a reachable destination.
Just as wetness is not a property of individual water molecules yet emerges from their collective interaction, intelligence may emerge from the interaction of many non-intelligent computational components. Studying emergence means studying relationships between things, not just things themselves — a fundamental reorientation of how we think about AI design.
Humans invent conceptual lenses — gravity, entropy, evolution — that reframe raw observations into meaningful patterns. AGI may require the ability to autonomously invent such frameworks: seeing data "as" a network, "as" a game, or "as" a competition — and fluidly switching lenses as contexts shift. This is the deepest unsolved problem in AGI research.
Shifting from a single monolithic model toward a "society" of specialized sub-agents — each expert in its domain — that collectively reason at a level none could achieve alone. This mirrors the brain's use of specialized regions for vision, language, emotion, and motor control, orchestrated by executive function.
Ant colonies exhibit extraordinary collective intelligence — building climate-controlled nests, farming fungi, waging organized warfare — with no central coordinator. Each ant follows local, simple rules. The colony's remarkable behavior emerges entirely from their interaction. This biological blueprint directly inspires distributed AGI architectures.
Can machines have emotions — and should they? Not human feelings, but functional analogs: internal control signals that regulate computation, prioritize goals, and drive adaptive behavior. We explore NARS as a case study in operational emotion modeling.
Functionalist emotions are not subjective feelings but internal control signals serving the same computational role as emotions in biological organisms: directing attention, modulating effort allocation, flagging surprising outcomes, and signaling when to abandon unproductive computational paths. They are information, not experience.
The Non-Axiomatic Reasoning System assigns priority weights to tasks based on relevance to current goals — analogous to desire. When predictions match reality, "satisfaction" reinforces that reasoning line. Unexpected outcomes trigger "surprise," redirecting computational resources to update the world model. The system adaptively allocates finite compute based on emotional-functional signals.
"Frustration" in NARS identifies logical impasses — situations where continued computation on a problem yields diminishing returns. Rather than wasting cycles indefinitely, the system uses frustration as a signal to abandon a strategy and explore alternative approaches. This is computational wisdom — knowing when to stop.
Reinforcement learning agents trained on Atari games exhibit proto-emotional behavior without any explicit emotion programming: they persist on strategies that historically yielded reward and abandon approaches with repeated failure. This emergent persistence and abandonment is functionally equivalent to desire and frustration — arising from the structure of reward-based learning itself.
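The priority and frustration signals described for NARS in this class can be illustrated with a small scheduler. This is a sketch under loud assumptions: `run`, `patience`, and the task table are hypothetical, and nothing here reproduces the actual NARS implementation. Priority decides which task gets compute first, and a run of steps without progress triggers abandonment.

```python
import heapq

def run(tasks, budget, patience=3):
    """Work on the highest-priority task first; abandon a task after
    `patience` consecutive steps without progress.

    `tasks` maps a name to (priority, list of per-step progress values).
    """
    queue = [(-prio, name) for name, (prio, _) in tasks.items()]
    heapq.heapify(queue)               # max-priority first via negated keys
    log = []
    while queue and budget > 0:
        _, name = heapq.heappop(queue)
        stalled = 0
        for gain in tasks[name][1]:
            if budget == 0:
                break
            budget -= 1
            stalled = 0 if gain > 0 else stalled + 1
            if stalled >= patience:    # "frustration": stop spending cycles here
                log.append((name, "abandoned"))
                break
        else:
            log.append((name, "finished"))
    return log

# Hypothetical workload: one stuck high-priority task, one easy low-priority task.
tasks = {"prove_lemma": (0.9, [0, 0, 0, 0, 0]), "parse_input": (0.5, [1, 1])}
log = run(tasks, budget=10)
```

The stuck task consumes three cycles and is dropped, leaving budget for the task that can actually make progress.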
Intelligence does not live in a single node — it lives in the spaces between nodes. This class examines how complex reasoning, creativity, and problem-solving emerge from the coordinated interaction of many simple, specialized components at multiple scales.
Inside transformer models, hundreds of "attention heads" each learn to track different relationships in data — one head might track grammatical subject-verb agreement, another coreference across paragraphs, another syntactic structure. Their combined output produces rich, contextual understanding that no single head possesses alone. This is emergence within a single model.
In multi-agent AI frameworks, individual models are assigned specific roles — researcher, critic, code executor, verifier, synthesizer — and their structured interaction produces outputs superior to any single agent. Debate between agents surfaces reasoning errors; collaboration between specialists produces solutions beyond any individual's reach.
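The role-based interaction described above can be sketched with plain functions standing in for agents. Everything here is hypothetical: `writer`, `executor`, and `collaborate` are illustrative stand-ins, not the API of any multi-agent framework, and a real system would call language models where this sketch returns fixed strings.

```python
def writer(task, feedback):
    """Stand-in for a code-writing agent: it corrects its draft once it receives feedback."""
    if feedback:
        return "def add(a, b): return a + b"
    return "def add(a, b): return a - b"   # deliberately buggy first draft

def executor(source):
    """Stand-in for an execution agent: runs the draft against one check."""
    namespace = {}
    exec(source, namespace)
    if namespace["add"](2, 3) == 5:
        return None                        # no complaints
    return "add(2, 3) did not return 5"

def collaborate(task, max_rounds=3):
    """Alternate between writer and executor until the check passes."""
    feedback = None
    for round_no in range(1, max_rounds + 1):
        draft = writer(task, feedback)
        feedback = executor(draft)
        if feedback is None:
            return draft, round_no
    return None, max_rounds

draft, rounds = collaborate("write an add function")
```

Neither stand-in produces a correct result alone on the first attempt; the fix appears only because one agent's output becomes the other's input.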
Designing systems where intelligence emerges requires different engineering principles: modularity (clean interfaces between components that can be developed independently), open-ended learning (no hard-coded task limits — the system can acquire new capabilities), and scalability (adding more agents measurably improves collective performance).
AutoGen (Microsoft) and CrewAI deploy teams of LLM agents collaborating on complex tasks: one writes code, another reviews it for bugs, a third executes and reports errors, a fourth synthesizes results into documentation. Complex software engineering projects that stump single models are solved through structured agent collaboration — emergence through interaction.
The arc of AI architecture bends toward modularity. This class details the shift from unified, single-model approaches to distributed systems — examining the Society of Mind theory, hierarchical reinforcement learning, and swarm intelligence as blueprints for next-generation AI.
Marvin Minsky's "Society of Mind" theory (1986) proposes that human intelligence arises from the interaction of hundreds of small, specialized mental "agents" — none of which is intelligent alone. This theory is now being operationalized in AI through modular architectures where specialized sub-networks handle distinct cognitive functions, coordinated by meta-cognitive orchestrators.
HRL systems decompose complex tasks into hierarchies: high-level "manager" agents set abstract goals (navigate to room B; write a program that sorts data), while low-level "worker" agents execute concrete actions (open door; write a specific function). This mirrors how human executive function coordinates physical actions — enabling long-horizon planning through goal decomposition.
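The manager and worker split above rests on goal decomposition, which the following sketch shows in isolation. The `SUBGOALS` table and `decompose` are hypothetical and hand-written; in an actual HRL system both the decomposition and the low-level policies would be learned from reward.

```python
# Hypothetical goal hierarchy: abstract goals map to sub-goals; leaves are primitive actions.
SUBGOALS = {
    "reach room B": ["leave room A", "cross hallway"],
    "leave room A": ["walk to door", "open door"],
    "cross hallway": ["walk forward"],
}

def decompose(goal):
    """Manager: recursively expand a goal until only primitive actions remain."""
    if goal not in SUBGOALS:
        return [goal]                  # primitive action handed to a worker
    plan = []
    for sub in SUBGOALS[goal]:
        plan.extend(decompose(sub))
    return plan

actions = decompose("reach room B")
```

The high-level goal never mentions doors or walking; those details exist only at the level that needs them.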
Like a murmuration of starlings producing complex collective flight patterns from simple local rules, AI swarms coordinate to solve problems no individual agent could. A related idea inside a single model is the "Mixture of Experts" architecture, which GPT-4 is reported, though not officially confirmed, to use: a router activates only the relevant expert sub-networks for each token, giving the model the capacity of several expert networks (reportedly about eight) at a fraction of the inference cost.
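The Mixture of Experts routing mentioned above can be sketched as top-k gating. This is a minimal illustration, not a description of any production model: `moe_forward` is a hypothetical name, the experts are toy functions, and the gate scores are passed in directly where a real router would compute them from the input with learned weights.

```python
import math

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(x, gate_scores, experts, k=2):
    """Route input x to the k highest-scoring experts and mix their outputs."""
    top = sorted(range(len(experts)), key=lambda i: gate_scores[i], reverse=True)[:k]
    weights = softmax([gate_scores[i] for i in top])   # renormalise over the chosen experts
    mixed = sum(w * experts[i](x) for w, i in zip(weights, top))
    return mixed, top

# Toy experts; only two of the four run for this input.
experts = [lambda x: x + 1, lambda x: 2 * x, lambda x: x * x, lambda x: -x]
output, used = moe_forward(3.0, gate_scores=[0.1, 2.0, 2.0, -1.0], experts=experts)
```

Experts outside the top k are never evaluated, which is where the inference saving comes from.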
Boston Dynamics' robots use hierarchical control architectures: high-level planners determine locomotion strategy (walk, climb, jump, recover from falls) while low-level controllers manage millisecond joint-level actuation. Neither layer alone produces the robot's extraordinary agility — it emerges entirely from their structured interaction across the hierarchy.
The capstone class synthesizes the entire iGentixAI framework. AGI's power lies not in memorizing isolated facts but in understanding the web of relationships between them. Knowledge graphs, attention mechanisms, and causal AI converge here into a unified vision of machine understanding.
Knowledge graphs store entities (people, places, molecules, concepts) as nodes and their semantic relationships as edges — enabling AI to reason about connections, not isolated facts. Google's Knowledge Graph powers the contextual information panels in search results. Graph Neural Networks learn directly from this relational structure, excelling at drug interaction prediction, social network modeling, and scientific discovery.
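The node and edge structure described above can be sketched with a handful of triples and a path query. The triples, `graph`, and `find_path` are illustrative assumptions; production knowledge graphs use dedicated stores and query languages instead of an in-memory dictionary.

```python
from collections import defaultdict

# Small illustrative set of (head, relation, tail) triples.
TRIPLES = [
    ("Marie Curie", "born_in", "Warsaw"),
    ("Warsaw", "capital_of", "Poland"),
    ("Poland", "located_in", "Europe"),
]

graph = defaultdict(list)
for head, relation, tail in TRIPLES:
    graph[head].append((relation, tail))

def find_path(start, goal, path=()):
    """Depth-first search returning the chain of (head, relation, tail) edges, or None."""
    if start == goal:
        return list(path)
    for relation, tail in graph.get(start, []):
        if all(tail != edge[0] for edge in path):      # do not revisit a node
            found = find_path(tail, goal, path + ((start, relation, tail),))
            if found is not None:
                return found
    return None

chain = find_path("Marie Curie", "Europe")
```

No single triple links Marie Curie to Europe; the answer exists only as a chain of relationships, which is the kind of inference isolated facts cannot support.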
Attention is fundamentally relational: "how much should token A influence the interpretation of token B?" This dynamic, context-sensitive weighting is the core reason transformers excel at language — they model relationships, not sequences. Extending this relational attention to multi-modal, multi-domain data (text + images + graphs + code) is an active and critical research frontier.
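The question "how much should token A influence token B?" is answered numerically by scaled dot-product attention. The sketch below uses plain Python lists for readability; the function name `attention` and the two-dimensional toy vectors are assumptions, and real transformers add learned query, key, and value projections and many heads on top of this core.

```python
import math

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        weights = [e / sum(exps) for e in exps]        # softmax: each row sums to 1
        outputs.append([sum(w * v[j] for w, v in zip(weights, values))
                        for j in range(len(values[0]))])
        all_weights.append(weights)
    return outputs, all_weights

# Toy sequence: tokens 0 and 2 are identical, token 1 is orthogonal to both.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
outputs, weights = attention(tokens, tokens, tokens)
```

In the toy run, token 0 attends equally to itself and token 2 and less to token 1, regardless of where they sit in the sequence: the weighting follows similarity, not position.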
Standard machine learning finds statistical correlations; Causal AI aims to find cause-and-effect. Judea Pearl's structural causal models and do-calculus let a system reason about interventions ("What happens if we administer treatment X?") and, given a full causal model, about counterfactuals ("What would have happened if we had administered treatment X?"). This is essential for medicine, policy, economics, and any high-stakes domain where correlation is dangerously insufficient for decision-making.
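The gap between correlation and cause can be shown with exact arithmetic on a tiny model. All numbers below are hypothetical and chosen for illustration: illness severity Z influences both who gets treated (X) and who recovers (Y). `observed` computes the conditional probability a correlational model would see, and `intervened` applies the standard backdoor adjustment over Z.

```python
# Hypothetical numbers: Z = severe illness, X = treated, Y = recovered.
P_Z = {0: 0.5, 1: 0.5}
P_X1_GIVEN_Z = {0: 0.2, 1: 0.8}        # sicker patients are treated more often
P_Y1_GIVEN_XZ = {(0, 0): 0.8, (1, 0): 0.9, (0, 1): 0.3, (1, 1): 0.4}

def p_x_given_z(x, z):
    return P_X1_GIVEN_Z[z] if x == 1 else 1 - P_X1_GIVEN_Z[z]

def observed(x):
    """P(Y=1 | X=x): what a purely correlational model sees."""
    joint = {z: P_Z[z] * p_x_given_z(x, z) for z in P_Z}
    total = sum(joint.values())
    return sum(P_Y1_GIVEN_XZ[(x, z)] * joint[z] / total for z in P_Z)

def intervened(x):
    """P(Y=1 | do(X=x)) via the backdoor adjustment formula over Z."""
    return sum(P_Y1_GIVEN_XZ[(x, z)] * P_Z[z] for z in P_Z)

naive_effect = observed(1) - observed(0)
causal_effect = intervened(1) - intervened(0)
```

With these numbers the naive comparison makes treatment look harmful (recovery is 20 points lower among the treated), while the adjusted estimate shows it helps by 10 points, because the treated group is dominated by severe cases.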
IBM's Watson for Drug Discovery, which IBM stopped selling in 2019, was built around knowledge graphs linking genes, proteins, diseases, pathways, and compounds across 50+ biomedical databases. Its aim was to surface molecular relationships likely to matter for a target disease, not merely those statistically associated with it, narrowing the candidate drug space ahead of clinical trials. Relational reasoning at biological scale.