Within the past decade, single-trial analysis of functional Near-Infrared Spectroscopy (fNIRS) signals has gained significant momentum, and fNIRS has joined the set of modalities frequently used for active and passive Brain-Computer Interfaces (BCI). A great variety of feature extraction and classification methods have been explored using state-of-the-art Machine Learning. In contrast, signal preprocessing and cleaning pipelines for fNIRS often follow simple recipes and so far rarely incorporate the state of the art available in adjacent fields. In neuroscience, where fMRI and fNIRS are established neuroimaging tools, evoked hemodynamic brain activity is typically estimated across multiple trials using a General Linear Model (GLM). With the GLM, subject-, channel-, and task-specific evoked hemodynamic responses are estimated, and the evoked brain activity is separated more robustly from systemic physiological interference by including independently measured nuisance regressors, such as short-separation fNIRS measurements. When correctly applied in single-trial analysis, e.g., in BCI, this approach can significantly enhance the contrast-to-noise ratio of the brain signal, improve feature separability, and ultimately lead to better classification accuracy. In this manuscript, we provide a brief introduction to the GLM and show how to incorporate it into a typical BCI preprocessing pipeline and cross-validation. Using a resting-state fNIRS data set augmented with synthetic hemodynamic responses that provide ground-truth brain activity, we compare the quality of commonly used fNIRS features for BCI that are extracted from (1) conventionally preprocessed signals and (2) signals preprocessed with the GLM and physiological nuisance regressors. We show that the GLM-based approach can provide better single-trial estimates of brain activity as well as a new feature type, i.e., the weight of the individual and channel-specific hemodynamic response function (HRF) regressor.
The improved estimates yield features with higher separability that significantly enhance accuracy in a binary classification task compared with conventional preprocessing, on average by +7.4% across subjects and feature types. We propose adapting this well-established approach from neuroscience to single-trial analysis and preprocessing wherever the classification of evoked brain activity is of concern, for instance in BCI.
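
To make the GLM idea concrete, the following minimal sketch fits one synthetic channel by ordinary least squares with three regressors: a task regressor (stimulus onsets convolved with an HRF), a short-separation nuisance regressor, and a constant. It is an illustration under stated assumptions, not the pipeline used in this work. The double-gamma HRF shape, the sampling rate, the onset times, the noise levels, and the helper names `canonical_hrf` and `fit_glm` are all hypothetical choices made for the example.

```python
import numpy as np
from math import gamma


def canonical_hrf(t, peak=6.0, undershoot=16.0, ratio=1.0 / 6.0):
    """Double-gamma HRF sampled at times t (s); parameter values are illustrative."""
    def g(a):
        return t ** (a - 1) * np.exp(-t) / gamma(a)
    return g(peak) - ratio * g(undershoot)


def fit_glm(y, onset_idx, short_sep, fs):
    """OLS fit of y on [task regressor, short-separation signal, constant]."""
    n = len(y)
    stick = np.zeros(n)
    stick[onset_idx] = 1.0                       # one impulse per trial onset
    hrf = canonical_hrf(np.arange(0, 30, 1.0 / fs))
    task = np.convolve(stick, hrf)[:n]           # expected evoked response
    X = np.column_stack([task, short_sep, np.ones(n)])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, X


# --- Synthetic demonstration (all values are assumptions) ---
rng = np.random.default_rng(0)
fs = 10                                          # sampling rate in Hz
n = 3000                                         # 300 s of data
t = np.arange(n) / fs
onset_idx = np.array([20, 47, 81, 110, 143, 171, 205, 232, 266]) * fs

systemic = np.sin(2 * np.pi * 0.1 * t)           # slow systemic oscillation
short_sep = systemic + 0.01 * rng.standard_normal(n)

true_weight = 0.5                                # ground-truth HRF amplitude
_, X_true = fit_glm(np.zeros(n), onset_idx, short_sep, fs)
y = true_weight * X_true[:, 0] + 0.8 * systemic + 0.05 * rng.standard_normal(n)

beta, _ = fit_glm(y, onset_idx, short_sep, fs)
print("estimated HRF weight:", beta[0])
print("estimated short-separation weight:", beta[1])
```

Here `beta[0]` plays the role of the HRF regressor weight mentioned above as a candidate feature. In a cross-validated BCI setting, any quantity learned from the data (for example, an individual HRF shape) would presumably need to be estimated on the training folds only, so that no information from test trials leaks into the features.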