D5 · Translation
Spec reference: Section D - Programming Languages
Key idea: Understand why and how high-level code is translated into machine code.
Why do programs need to be translated?
Computers only understand machine code - binary instructions (0s and 1s) specific to a processor's instruction set. Programmers write code in high-level languages (Python, Java, C++) that are readable by humans but cannot be directly executed by a CPU.
Translation converts source code into machine code so the computer can run it.
Types of programming language
| Level | Description | Example |
|---|---|---|
| Machine code | Binary instructions (0s and 1s) directly executed by CPU | 10110000 01100001 |
| Assembly language | Human-readable mnemonics representing machine code instructions | MOV AL, 61h |
| Low-level language | Close to the hardware; covers machine code and assembly language | Machine code, assembly |
| High-level language | Human-readable, abstracted from hardware | Python, Java, C#, JavaScript |
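The machine code example in the table can be read as two bytes. The following is a minimal Python sketch that converts the bit strings to hexadecimal so they can be compared with the assembly example `MOV AL, 61h`. The opcode meaning given in the comments assumes the x86 instruction set; other processors use different encodings.

```python
# The two bytes from the machine code row of the table.
machine_code = "10110000 01100001"

# Convert each group of 8 bits to an integer, then to two-digit hexadecimal.
byte_values = [int(bits, 2) for bits in machine_code.split()]
hex_bytes = [f"{value:02X}" for value in byte_values]

print(hex_bytes)  # ['B0', '61']

# Assumption: on x86, B0 is the opcode for "MOV AL, <8-bit value>",
# and 61 is the value being loaded, i.e. MOV AL, 61h.
```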
Three types of translator
1. Assembler
An assembler translates assembly language into machine code.
Assembly: MOV AX, 5 → Assembler → Machine code: 10111000 00000101 00000000
- One assembly instruction → one (or a few) machine code instructions.
- Used for low-level systems programming, device drivers, embedded systems.
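To make the idea of an assembler concrete, here is a toy sketch in Python. It is not a real assembler: it handles only one instruction form, `MOV <8-bit register>, <hex value>h`, and the names `assemble`, `as_bits`, `REGISTER_CODES` and `MOV_IMM8_BASE` are made up for this illustration. The encoding used (opcode `B0` plus the register number, followed by the value) assumes the x86 instruction set.

```python
# Toy assembler sketch: handles only "MOV <8-bit register>, <hex value>h".
# Register numbers follow the x86 encoding for 8-bit registers.
REGISTER_CODES = {"AL": 0, "CL": 1, "DL": 2, "BL": 3}
MOV_IMM8_BASE = 0xB0  # x86 opcode base for MOV r8, imm8


def assemble(line: str) -> bytes:
    """Translate one assembly instruction into its machine code bytes."""
    mnemonic, operands = line.strip().split(maxsplit=1)
    if mnemonic.upper() != "MOV":
        raise ValueError(f"unsupported mnemonic: {mnemonic}")
    register, value = [part.strip() for part in operands.split(",")]
    register = register.upper()
    if register not in REGISTER_CODES:
        raise ValueError(f"unsupported register: {register}")
    if not value.lower().endswith("h"):
        raise ValueError("expected a hexadecimal value such as 61h")
    immediate = int(value[:-1], 16)
    if not 0 <= immediate <= 0xFF:
        raise ValueError("value does not fit in 8 bits")
    # One mnemonic line becomes two bytes: opcode, then the value.
    return bytes([MOV_IMM8_BASE + REGISTER_CODES[register], immediate])


def as_bits(code: bytes) -> str:
    """Show machine code as space-separated groups of 8 bits."""
    return " ".join(f"{byte:08b}" for byte in code)


print(as_bits(assemble("MOV AL, 61h")))  # 10110000 01100001
```

The output matches the machine code row in the language table earlier in these notes, which shows the near one-to-one mapping between a mnemonic and its binary form.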
2. Compiler
A compiler translates an entire high-level language program into machine code all at once, before execution.
Source code (C/C++) → Compiler → Executable file (.exe)
↓
Run later without re-translating
How compilation works:
- Lexical analysis - tokenises the source code (breaks into keywords, identifiers, operators).
- Syntax analysis - checks grammar rules (produces a parse tree).
- Semantic analysis - checks meaning (are variables declared? correct types?).
- Code generation - produces machine code or intermediate code.
- Optimisation - improves efficiency of the generated code.
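The first stage above, lexical analysis, can be sketched in a few lines of Python. This is a simplified illustration, not how any particular compiler is written: the token categories, the tiny keyword list and the function name `tokenise` are assumptions chosen for the example.

```python
import re

# Token categories for a tiny, made-up language (an assumption for
# illustration; a real compiler recognises many more categories).
TOKEN_SPEC = [
    ("KEYWORD", r"\b(?:if|else|while|print)\b"),
    ("NUMBER", r"\d+"),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("OPERATOR", r"[+\-*/=<>()]"),
    ("SKIP", r"\s+"),
    ("MISMATCH", r"."),
]
TOKEN_PATTERN = re.compile(
    "|".join(f"(?P<{name}>{regex})" for name, regex in TOKEN_SPEC)
)


def tokenise(source: str) -> list[tuple[str, str]]:
    """Break source text into (category, text) tokens."""
    tokens = []
    for match in TOKEN_PATTERN.finditer(source):
        kind, text = match.lastgroup, match.group()
        if kind == "SKIP":
            continue  # whitespace separates tokens but is not a token
        if kind == "MISMATCH":
            raise SyntaxError(f"unexpected character {text!r}")
        tokens.append((kind, text))
    return tokens


print(tokenise("total = price * 2"))
# [('IDENTIFIER', 'total'), ('OPERATOR', '='), ('IDENTIFIER', 'price'), ('OPERATOR', '*'), ('NUMBER', '2')]
```

The list of tokens is what the syntax analysis stage would then check against the grammar rules of the language.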
Languages that use compilation: C, C++, Rust, Go.
3. Interpreter
An interpreter translates and executes a program one line at a time: it reads a line, works out what it means and carries it out immediately, without producing a separate executable file.
Source code → [Interpreter reads line 1] → [Executes line 1]
→ [Interpreter reads line 2] → [Executes line 2]
→ ... and so on
Languages that use interpretation: Python, JavaScript (in browser), Ruby, PHP.
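The read-then-execute loop can be sketched as a toy interpreter in Python. The mini-language here (`SET`, `ADD`, `PRINT`) and the function name `interpret` are invented for this example and do not correspond to any real language; the point is only that each line is handled before the next is read, and that execution stops at the first line that cannot be understood.

```python
def interpret(source: str) -> list[str]:
    """Run a made-up mini-language one line at a time.

    Supported statements (assumptions for this sketch):
        SET <name> <integer>
        ADD <name> <integer>
        PRINT <name>
    """
    variables: dict[str, int] = {}
    output: list[str] = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        command, args = parts[0], parts[1:]
        # Each line is decoded and carried out before the next one is read.
        if command == "SET" and len(args) == 2:
            variables[args[0]] = int(args[1])
        elif command == "ADD" and len(args) == 2:
            variables[args[0]] += int(args[1])
        elif command == "PRINT" and len(args) == 1:
            output.append(str(variables[args[0]]))
        else:
            # Execution stops at the first line that cannot be understood.
            output.append(f"Error on line {line_number}: {line.strip()}")
            break
    return output


program = """SET x 5
ADD x 3
PRINT x
JUMP x
PRINT x"""

print(interpret(program))  # ['8', 'Error on line 4: JUMP x']
```

Lines 1 to 3 run and produce output before the faulty line 4 is even looked at, and line 5 never runs.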
Compiler vs Interpreter - comparison
| | Compiler | Interpreter |
|---|---|---|
| Translates | Entire program at once | One line at a time |
| Output | Executable file (.exe) | No separate output file |
| Execution speed | Fast (translated once, run many times) | Slower (translated during every run) |
| Error reporting | All errors reported at once after analysis | Stops at the first error encountered |
| Portability | Compiled code is platform-specific | Source code can run anywhere with interpreter |
| Development | Slower edit-compile-run cycle | Faster to test - run immediately |
| Examples | C, C++, Rust | Python, JavaScript, Ruby |
Reasons for translation
Programs need to be translated because:
- CPUs only understand machine code - binary instructions.
- High-level languages are hardware-independent - the same Python code runs on Windows, Mac, Linux.
- Human productivity - writing in Python is vastly more productive than writing in binary.
- Portability - write once, translate for different platforms.
Benefits of using high-level languages
| Benefit | Detail |
|---|---|
| Readability | Code resembles English - easier to write, read and maintain |
| Portability | Same source code can be translated for different hardware |
| Abstraction | Hardware details are hidden - focus on solving the problem |
| Productivity | One high-level statement may replace many machine code instructions |
| Rich libraries | Access to huge collections of pre-written functions |
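The productivity row can be illustrated with Python's standard `dis` module, which lists the bytecode instructions that CPython produces for a piece of source code. Bytecode is not machine code, so this is only an analogy for "one statement, many instructions", and the exact instructions printed vary between Python versions. The statement and its variable names are made up for the example.

```python
import dis

# One high-level statement, compiled to CPython bytecode.
statement = "total = price * quantity + delivery"
code = compile(statement, "<example>", "exec")

# List every lower-level instruction behind that single statement.
instructions = list(dis.get_instructions(code))
for instruction in instructions:
    print(instruction.opname, instruction.argrepr)

print(len(instructions), "bytecode instructions for one statement")
```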
Drawbacks of translation
| Drawback | Detail |
|---|---|
| Compilation overhead | Compilation step adds time before execution |
| Interpreted programs are slower | Translation on-the-fly is slower than pre-compiled code |
| Platform dependency | Compiled code may not run on a different CPU architecture |
| Source code exposure | Interpreted programs need the source code present - easier to copy |
Just-In-Time (JIT) compilation
Some languages (Java, C#, JavaScript in V8) use JIT compilation - a hybrid approach:
- Source code is compiled to bytecode (intermediate, platform-independent).
- At runtime, a JIT compiler converts bytecode to machine code just before it is needed.
- Frequently-run code is compiled once and cached, giving near-native speed.
Java source → Java compiler → Bytecode (.class)
↓
JVM + JIT compiler
↓
Machine code (at runtime)
Summary
| Translator | Input | Output | When? |
|---|---|---|---|
| Assembler | Assembly language | Machine code | Before execution |
| Compiler | High-level source | Executable (machine code) | Before execution |
| Interpreter | High-level source | Executes directly | During execution |
| JIT compiler | Bytecode | Machine code | During execution (optimised) |
Test Yourself
Why must high-level programs be translated?