How well do we understand the neural origins of the fMRI BOLD signal? – Arthurs & Boniface, 2002
Functional magnetic resonance imaging (fMRI) is a widely used non-invasive tool for imaging functionally active brain regions. It most commonly uses the blood oxygenation level-dependent (BOLD) method, which relies on hemoglobin as an endogenous contrast agent: the signal comes from the difference in magnetization between oxygenated and deoxygenated hemoglobin. The exact neural origin of this signal is still unknown, but BOLD responses increase directly with neural activity and correlate particularly well with local field potential (LFP) measures, which represent synchronized synaptic inputs. Evoked potentials are very similar to LFPs: they arise from extracellular currents produced by summated postsynaptic potentials, and so represent the synaptic activity of a population rather than neuronal firing states.
The BOLD signal is affected by many variables, such as the choice of anesthetic, which alters metabolic and cellular activity. Neuronal activity is linearly correlated with hemodynamic activity. Early reports suggested that both action potentials and synaptic activity correlate with the BOLD signal and may contribute to it. However, more recent evidence suggests that action potential activity in cortical cells accounts for only 3% of resting energy consumption, so its contribution stays small even if it doubles during activation. Synaptic activity is different: most energy and oxygen consumption occurs following the activity and accounts for up to 95%. It is therefore expected that the fMRI BOLD signal is related mainly to synaptic activity.
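To make the "minimal even if doubled" point concrete, here is a rough arithmetic sketch. It uses the 3% figure from the notes and assumes, purely for illustration, that every other energy cost stays fixed while spiking cost is scaled. The function names are made up for this example and do not come from the paper.

```python
def total_energy_increase(spiking_share: float, scale: float) -> float:
    """Fractional rise in total energy use when only the spiking cost is scaled.

    spiking_share: fraction of resting energy spent on action potentials (0.03 in the notes).
    scale: multiplier applied to spiking cost (2.0 means doubled).
    Assumption: all non-spiking energy use is held constant.
    """
    return spiking_share * (scale - 1.0)


def spiking_share_after_scaling(spiking_share: float, scale: float) -> float:
    """Fraction of the new total energy budget that spiking accounts for."""
    scaled_spiking = spiking_share * scale
    other = 1.0 - spiking_share  # everything that is not spiking, unchanged
    return scaled_spiking / (scaled_spiking + other)


if __name__ == "__main__":
    rise = total_energy_increase(0.03, 2.0)
    share = spiking_share_after_scaling(0.03, 2.0)
    print(f"total energy rises by {rise:.1%}")
    print(f"spiking share after doubling: {share:.1%}")
```

Under these assumptions, doubling spiking raises total energy use by only about 3%, and spiking still accounts for less than 6% of the budget.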

Relevance of the relationship between action potentials and synaptic activity
This relationship is complicated. An action potential occurs when depolarization brings the membrane potential to threshold, which in turn is determined by integrating the incoming postsynaptic potentials, both excitatory and inhibitory. Excitatory potentials increase the chance of reaching threshold, while inhibitory potentials decrease it. For a single postsynaptic neuron, reaching threshold would require 75 afferent neurons to fire simultaneously to produce a 10 mV depolarization. Spiking activity adapts quickly, whereas synaptic LFP activity may be maintained throughout stimulus presentation.
However, you would expect a linear relationship between action potential firing rate and synaptic metabolic activity, to prevent information loss between the axon and dendrite of the same neuron. Check the paper for more info.
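The threshold figures above (75 simultaneous afferents for a 10 mV depolarization) can be turned into a toy integration model. This is a deliberately crude sketch: it assumes every input is the same size, that inputs sum linearly, and that one inhibitory input exactly cancels one excitatory input. None of these assumptions comes from the paper, and the function names are hypothetical.

```python
# Figures taken from the notes: 75 simultaneous afferents give a 10 mV depolarization.
AFFERENTS_FOR_THRESHOLD = 75
THRESHOLD_DEPOLARIZATION_MV = 10.0


def net_depolarization_mv(n_excitatory: int, n_inhibitory: int) -> float:
    """Net depolarization from simultaneous inputs, in mV.

    Assumptions (illustrative only): equal-sized inputs, linear summation,
    and each inhibitory input cancelling exactly one excitatory input.
    """
    net_inputs = n_excitatory - n_inhibitory
    return net_inputs * THRESHOLD_DEPOLARIZATION_MV / AFFERENTS_FOR_THRESHOLD


def reaches_threshold(n_excitatory: int, n_inhibitory: int) -> bool:
    """True if the summed inputs depolarize the cell by at least 10 mV."""
    return net_depolarization_mv(n_excitatory, n_inhibitory) >= THRESHOLD_DEPOLARIZATION_MV


if __name__ == "__main__":
    for n_exc, n_inh in [(75, 0), (75, 10), (85, 10)]:
        fires = reaches_threshold(n_exc, n_inh)
        print(f"{n_exc} excitatory, {n_inh} inhibitory -> fires: {fires}")
```

In this toy model, 75 excitatory inputs alone just reach threshold, while adding 10 inhibitory inputs means 85 excitatory inputs are needed. Substantial synaptic activity can therefore occur without producing any spike.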
fMRI BOLD detects population activity
The best current resolution is at the level of one cortical column, containing 10^5 neurons. Most studies trade spatial resolution for shorter acquisition times, since a higher-resolution acquisition would take longer and so be less temporally accurate; the resulting spatial resolution is 8-50 mm^3, containing at least 10^6 neurons. MRI voxels: an MRI image is composed of a number of voxels, and the voxel size is the spatial resolution of the image. A voxel is a 3D unit of the image with a single value, just as a pixel in a digital photograph is a 2D unit of the image with a single value. Given the scale an effect must reach to be seen, it is still unclear whether this technology can differentiate large changes in small populations of cells from small changes in a large population of cells.
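The ambiguity in that last sentence can be shown with simple arithmetic. The sketch below assumes, as a simplification not taken from the paper, that a voxel's signal change is just the sum of per-cell activity changes in arbitrary units. The voxel dimensions and function names are hypothetical examples.

```python
def voxel_volume_mm3(x_mm: float, y_mm: float, z_mm: float) -> float:
    """Volume of a rectangular voxel in cubic millimetres."""
    return x_mm * y_mm * z_mm


def summed_change(n_cells: int, change_per_cell: int) -> int:
    """Total activity change in a voxel, in arbitrary units.

    Assumption (illustrative only): the voxel value is a plain sum over cells.
    """
    return n_cells * change_per_cell


if __name__ == "__main__":
    # A 2 x 2 x 2 mm voxel sits at the low end of the 8-50 mm^3 range in the notes.
    print("voxel volume:", voxel_volume_mm3(2, 2, 2), "mm^3")

    # Two scenarios inside a voxel holding 10^6 neurons:
    large_change_small_population = summed_change(10_000, 100)
    small_change_large_population = summed_change(1_000_000, 1)
    print(large_change_small_population == small_change_large_population)
```

Both scenarios give the same summed change, so a single voxel value cannot by itself tell them apart under this simplified model.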
Another debate is whether fMRI can account for changes in overall population, or background, activity. Neuronal firing in the background population is constantly changing, and taking this into account is important for detecting the more significant local changes. However, by ignoring these background processes we may also lose small processes that are critical for the function to occur. For example, arousal and attention are important cognitive states that may not produce a significant net change in firing. Given that fMRI signals represent global synaptic activity levels, they might capture subtle changes in synchronization that occur at normal firing rates.
Definitely check paper for more info.