
D5 · Translation

Spec reference: Section D - Programming Languages
Key idea: Understand why and how high-level code is translated into machine code.


Why do programs need to be translated?

Computers only understand machine code - binary instructions (0s and 1s) specific to a processor's instruction set. Programmers write code in high-level languages (Python, Java, C++) that are readable by humans but cannot be directly executed by a CPU.

Translation converts source code into machine code so the computer can run it.


Types of programming language

Level | Description | Example
Machine code | Binary instructions (0s and 1s) directly executed by CPU | 10110000 01100001
Assembly language | Human-readable mnemonics representing machine code instructions | MOV AL, 61h
Low-level language | Close to hardware - assembly | Assembly
High-level language | Human-readable, abstracted from hardware | Python, Java, C#, JavaScript
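
The machine code and assembly examples in the table describe the same instruction. A quick check, written in Python purely for illustration and assuming the x86 encoding in which the byte B0 means "move an 8-bit value into AL":

```python
# The two bytes from the machine code row of the table.
machine_code = "10110000 01100001"

# Convert each 8-bit group to an integer, then show it in hexadecimal.
byte_values = [int(group, 2) for group in machine_code.split()]
hex_view = [format(b, "02X") for b in byte_values]

print(hex_view)  # ['B0', '61']
```

The second byte, 61 in hexadecimal, is the 61h operand written in the assembly version.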

Three types of translator

1. Assembler

An assembler translates assembly language into machine code.

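Assembly: MOV AX, 5    →    Assembler    →    Machine code: 10111000 00000101 00000000
  • The example uses 16-bit x86: AX holds 16 bits, so the value 5 is stored as two bytes, low byte first.
  • One assembly instruction usually becomes one machine code instruction (occasionally a few).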
  • Used for low-level systems programming, device drivers, embedded systems.
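
As a rough illustration of what an assembler does, here is a minimal sketch in Python that handles only one instruction form, MOV with a 16-bit register and a number. The function name assemble_mov and the register table are made up for this example; the opcode values assume 16-bit x86, where a 16-bit value takes two bytes with the low byte first.

```python
# Toy assembler for one instruction form: MOV <16-bit register>, <number>.
# Register numbers follow the 16-bit x86 encoding; everything else is simplified.
REGISTERS = {"AX": 0, "CX": 1, "DX": 2, "BX": 3}

def assemble_mov(line: str) -> bytes:
    """Translate e.g. 'MOV AX, 5' into its machine code bytes."""
    mnemonic, operands = line.split(maxsplit=1)
    if mnemonic.upper() != "MOV":
        raise ValueError(f"unsupported instruction: {mnemonic}")
    register, value = [part.strip() for part in operands.split(",")]
    opcode = 0xB8 + REGISTERS[register.upper()]   # 0xB8 = MOV AX, 16-bit value
    immediate = int(value).to_bytes(2, "little")  # 16-bit value, low byte first
    return bytes([opcode]) + immediate

code = assemble_mov("MOV AX, 5")
print(" ".join(format(b, "08b") for b in code))
# 10111000 00000101 00000000
```

A real assembler also handles labels, many more instructions and different operand sizes, but the core job is the same lookup from mnemonic to bit pattern.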

2. Compiler

A compiler translates an entire high-level language program into machine code all at once, before execution.

Source code (C/C++)  →  Compiler  →  Executable file (.exe)
                                          ↓
                                    Run later without re-translating

How compilation works:

  1. Lexical analysis - tokenises the source code (breaks into keywords, identifiers, operators).
  2. Syntax analysis - checks grammar rules (produces a parse tree).
  3. Semantic analysis - checks meaning (are variables declared? correct types?).
  4. Code generation - produces machine code or intermediate code.
  5. Optimisation - improves efficiency of the generated code.
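
The first stage, lexical analysis, can be sketched in a few lines of Python. This is a toy tokeniser for a made-up language, not the rules of any real compiler; the token names and the tokenise function are assumptions chosen for the example.

```python
import re

# Token patterns for a tiny, made-up language (not any real compiler's rules).
TOKEN_SPEC = [
    ("NUMBER",     r"\d+"),
    ("KEYWORD",    r"\b(?:if|while|print)\b"),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("OPERATOR",   r"[+\-*/=<>]"),
    ("SKIP",       r"\s+"),
    ("MISMATCH",   r"."),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenise(source: str) -> list[tuple[str, str]]:
    """Break source text into (kind, text) tokens, ignoring whitespace."""
    tokens = []
    for match in TOKEN_RE.finditer(source):
        kind, text = match.lastgroup, match.group()
        if kind == "SKIP":
            continue
        if kind == "MISMATCH":
            raise SyntaxError(f"unexpected character {text!r}")
        tokens.append((kind, text))
    return tokens

print(tokenise("total = price * 2"))
# [('IDENTIFIER', 'total'), ('OPERATOR', '='), ('IDENTIFIER', 'price'), ('OPERATOR', '*'), ('NUMBER', '2')]
```

The syntax analysis stage would then take this list of tokens and check that they appear in an order the grammar allows.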

Languages that are typically compiled: C, C++, Rust, Go.


3. Interpreter

An interpreter translates and executes a program one line at a time: each statement is analysed and carried out immediately, before the next line is read.

Source code → [Interpreter reads line 1] → [Executes line 1]
           → [Interpreter reads line 2] → [Executes line 2]
           → ... and so on

Languages that are typically interpreted: Python, JavaScript (in browser), Ruby, PHP.
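
The read-a-line, run-a-line loop can be sketched with a toy interpreter. The mini language here (SET, ADD, PRINT) and the interpret function are invented for illustration only. Notice that the lines before the mistake still run, and nothing after it does.

```python
def interpret(program: str) -> list[str]:
    """Run a made-up mini language one line at a time; return what it printed."""
    variables: dict[str, int] = {}
    output: list[str] = []
    for number, line in enumerate(program.splitlines(), start=1):
        words = line.split()
        if not words:
            continue  # skip blank lines
        if words[0] == "SET" and len(words) == 3:      # SET name value
            variables[words[1]] = int(words[2])
        elif words[0] == "ADD" and len(words) == 3:    # ADD name value
            variables[words[1]] += int(words[2])
        elif words[0] == "PRINT" and len(words) == 2:  # PRINT name
            output.append(str(variables[words[1]]))
        else:
            output.append(f"Error on line {number}: cannot understand {line!r}")
            break  # stop at the first error
    return output

program = """SET x 5
ADD x 3
PRINT x
PRNT x
PRINT x"""
print(interpret(program))
# ['8', "Error on line 4: cannot understand 'PRNT x'"]
```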


Compiler vs Interpreter - comparison

Feature | Compiler | Interpreter
Translates | Entire program at once | One line at a time
Output | Executable file (.exe) | No separate output file
Execution speed | Fast (translated once, run many times) | Slower (translated during every run)
Error reporting | All errors reported at once after analysis | Stops at the first error encountered
Portability | Compiled code is platform-specific | Source code can run anywhere with interpreter
Development | Slower edit-compile-run cycle | Faster to test - run immediately
Examples | C, C++, Rust | Python, JavaScript, Ruby

Reasons for translation

Programs need to be translated because:

  1. CPUs only understand machine code - binary instructions.
  2. High-level languages are hardware-independent - the same Python code runs on Windows, Mac, Linux.
  3. Human productivity - writing in Python is vastly more productive than writing in binary.
  4. Portability - write once, translate for different platforms.

Benefits of using high-level languages

Benefit | Detail
Readability | Code resembles English - easier to write, read and maintain
Portability | Same source code can be translated for different hardware
Abstraction | Hardware details are hidden - focus on solving the problem
Productivity | One high-level statement may replace many machine code instructions
Rich libraries | Access to huge collections of pre-written functions

Drawbacks of translation

Drawback | Detail
Compilation overhead | Compilation step adds time before execution
Interpreted programs are slower | Translation on-the-fly is slower than pre-compiled code
Platform dependency | Compiled code may not run on a different CPU architecture
Source code exposure | Interpreted programs need the source code present - easier to copy

Just-In-Time (JIT) compilation

Some languages (Java, C#, JavaScript in V8) use JIT compilation - a hybrid approach:

  1. Source code is compiled to bytecode (intermediate, platform-independent).
  2. At runtime, a JIT compiler converts bytecode to machine code just before it is needed.
  3. Frequently-run code is compiled once and cached, giving near-native speed.

Java source → Java compiler → Bytecode (.class)
                                    ↓
                            JVM + JIT compiler
                                    ↓
                            Machine code (at runtime)

Summary

Translator | Input | Output | When?
Assembler | Assembly language | Machine code | Before execution
Compiler | High-level source | Executable (machine code) | Before execution
Interpreter | High-level source | Executes directly | During execution
JIT compiler | Bytecode | Machine code | During execution (optimised)

Test Yourself

Why must high-level programs be translated?
