Arrays are the most fundamental data structure in computer science, providing contiguous storage for elements of the same type. They serve as the building block for many higher-level data structures and are essential for understanding memory layout and access patterns.
Key Characteristics:
Dynamic arrays (also called resizable arrays, array lists, or vectors) extend static arrays with automatic resizing capability, making them practical for scenarios where the size isn't known in advance.
A static array is a fixed-size collection of elements stored in contiguous memory locations. The size is determined at creation and cannot change.
// Array of 5 integers
int arr[5] = {10, 20, 30, 40, 50};
Memory:
Address: 1000 1004 1008 1012 1016
Value: 10 20 30 40 50
Index: [0] [1] [2] [3] [4]
Element address = base_address + (index × element_size)
This formula enables O(1) random access, the defining feature of arrays.
Dynamic arrays maintain a backing static array but automatically resize when capacity is exceeded. This combines array efficiency with flexibility.
Doubling Strategy (Growth Factor = 2):
Initial capacity: 4
After 5 insertions: resize to 8
After 9 insertions: resize to 16
After 17 insertions: resize to 32
Why Doubling Works:
Although a single append may take O(n) time during resize, the average cost per append is O(1) when analyzed over a sequence of operations.
Proof Sketch:
| Factor | Pros | Cons |
|---|---|---|
| 2.0 | Simple, fast amortization | Memory waste up to 50% |
| 1.5 | Better memory reuse | More frequent resizes |
| φ (1.618) | Optimal reuse (Fibonacci) | Complex to implement |
Some implementations also shrink capacity when size drops below 1/4 capacity (to maintain load factor ≥ 0.25). This prevents pathological memory usage but adds complexity.
typedef struct {
int* data; // Pointer to backing array
size_t size; // Current number of elements
size_t capacity; // Current capacity
} DynamicArray;
DynamicArray* create(size_t initial_capacity) {
DynamicArray* arr = malloc(sizeof(DynamicArray));
arr->data = malloc(initial_capacity * sizeof(int));
arr->size = 0;
arr->capacity = initial_capacity;
return arr;
}
void append(DynamicArray* arr, int value) {
if (arr->size == arr->capacity) {
// Resize: allocate 2× capacity
size_t new_capacity = arr->capacity * 2;
int* new_data = malloc(new_capacity * sizeof(int));
// Copy old data
memcpy(new_data, arr->data, arr->size * sizeof(int));
// Free old array, update pointer
free(arr->data);
arr->data = new_data;
arr->capacity = new_capacity;
}
arr->data[arr->size++] = value;
}
void insert(DynamicArray* arr, size_t index, int value) {
if (arr->size == arr->capacity) {
resize(arr); // Same as append
}
// Shift elements right
for (size_t i = arr->size; i > index; i--) {
arr->data[i] = arr->data[i - 1];
}
arr->data[index] = value;
arr->size++;
}
void delete(DynamicArray* arr, size_t index) {
// Shift elements left
for (size_t i = index; i < arr->size - 1; i++) {
arr->data[i] = arr->data[i + 1];
}
arr->size--;
// Optional: shrink if size < capacity/4
if (arr->size < arr->capacity / 4) {
shrink(arr);
}
}
std::vectorArrayListlist (built-in)List<T>Array (built-in)Vec<T>| Operation | Static Array | Dynamic Array (Worst) | Dynamic Array (Amortized) |
|---|---|---|---|
| Access by index | O(1) | O(1) | O(1) |
| Update by index | O(1) | O(1) | O(1) |
| Append (end) | N/A | O(n) | O(1) |
| Insert (beginning) | N/A | O(n) | O(n) |
| Insert (middle) | N/A | O(n) | O(n) |
| Delete (end) | N/A | O(1) | O(1) |
| Delete (beginning) | N/A | O(n) | O(n) |
| Search (unsorted) | O(n) | O(n) | O(n) |
| Search (sorted) | O(log n) | O(log n) | O(log n) |
Arrays excel at cache utilization due to spatial locality:
Cache Line Example:
// Modern CPUs: 64-byte cache lines
// int = 4 bytes → 16 ints per cache line
// Accessing arr[0] likely loads arr[0..15] into cache
✅ Good For:
❌ Poor For:
| Aspect | Array | Linked List |
|---|---|---|
| Access | O(1) | O(n) |
| Insert (front) | O(n) | O(1) |
| Memory overhead | Low | High (pointers) |
| Cache locality | Excellent | Poor |
Rule of thumb: Prefer arrays unless you need frequent O(1) insertions at arbitrary positions.
std::vector<bool> vs std::vector<int> per element?