Application: Traffic flow simulation and prediction
The problem to solve
As autonomous vehicles become more common and more sensors are deployed on roads, it is now possible to build a better transportation simulation system that incorporates historical data for traffic flow prediction. A city-scale transportation network simulation will require a combination of statistical skills and computing techniques that can take advantage of abundant hardware. \
The main approach to traffic flow prediction is machine learning. However, complex neural network models often take a long time to train. It is therefore extremely useful to scale the training process, which can drastically decrease training time and increase accuracy\cite{zhao2019parallel}. \
The backpropagation neural network (BPNN) is one of the most widely used models because of its excellent function approximation ability. A typical BPNN contains three kinds of layers: an input layer, one or more hidden layers, and an output layer. The input layer is the entrance of the algorithm: it feeds one instance of the data into the network, and the dimension of the instance determines the number of inputs. The hidden layers pass intermediate data to the output layer, which generates the final output of the neural network. The number of outputs is determined by the encoding of the classification results. Each layer in a BPNN consists of a number of neurons, and the linear or nonlinear function in each neuron is typically controlled by two kinds of parameters: weights and a bias. \
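The layer structure described above can be sketched in a few lines of plain Python. This is only an illustration under assumptions that are not in the original text: the sigmoid activation, the layer sizes (4 inputs, 3 hidden neurons, 2 outputs), the example instance, and all function names are hypothetical.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def init_layer(n_in, n_out, rng):
    """One layer: a weight row and a bias for each of its n_out neurons."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def layer_forward(inputs, weights, biases):
    """Each neuron applies a nonlinearity to its weighted input sum plus bias."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def feed_forward(instance, layers):
    """Pass one data instance through every layer in order."""
    activation = instance
    for weights, biases in layers:
        activation = layer_forward(activation, weights, biases)
    return activation

rng = random.Random(0)
instance = [0.2, 0.7, 0.1, 0.9]       # hypothetical instance with 4 features
layers = [init_layer(4, 3, rng),      # input (4 values) -> hidden layer (3 neurons)
          init_layer(3, 2, rng)]      # hidden layer (3) -> output layer (2 neurons)
output = feed_forward(instance, layers)
```

The length of `instance` fixes the input dimension of the first layer, and the size of the last layer fixes the number of outputs, mirroring the description above.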
In the training phase, a BPNN uses a feed-forward pass to generate an output and then calculates the error between that output and the target output. It then uses backpropagation to tune the weights and biases of the neurons based on the calculated error. In the classification phase, the BPNN executes only the feed-forward pass to obtain the final classification result. Although it is difficult to determine an optimal number of hidden layers and neurons for a given classification task, it has been shown that a three-layer BPNN is sufficient to approximate the mapping between the inputs and the outputs. However, because the algorithm involves a large number of mathematical calculations, the BPNN becomes inefficient in both the training phase and the classification phase when the data size is large. \
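As a rough illustration of the training phase, the sketch below runs feed forward, computes a squared error, and applies one backpropagation update per step for a three-layer network. The sigmoid activation, squared-error loss, learning rate, layer sizes, and the single training pair are assumptions chosen for the example, not details from the cited work.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w_h, b_h, w_o, b_o):
    """Feed forward: input -> hidden activations -> output activations."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_h, b_h)]
    output = [sigmoid(sum(w * h for w, h in zip(row, hidden)) + b)
              for row, b in zip(w_o, b_o)]
    return hidden, output

def train_step(x, target, w_h, b_h, w_o, b_o, lr):
    """One feed-forward pass plus one backpropagation update (in place).

    Returns the squared error measured before the update.
    """
    hidden, output = forward(x, w_h, b_h, w_o, b_o)
    error = 0.5 * sum((o - t) ** 2 for o, t in zip(output, target))
    # Error signal at each output neuron (derivative of sigmoid is o * (1 - o)).
    delta_o = [(o - t) * o * (1 - o) for o, t in zip(output, target)]
    # Propagate the error back to the hidden layer, using w_o before it changes.
    delta_h = [h * (1 - h) * sum(delta_o[k] * w_o[k][j] for k in range(len(w_o)))
               for j, h in enumerate(hidden)]
    # Tune weights and biases with plain gradient descent.
    for k, row in enumerate(w_o):
        for j in range(len(row)):
            row[j] -= lr * delta_o[k] * hidden[j]
        b_o[k] -= lr * delta_o[k]
    for j, row in enumerate(w_h):
        for i in range(len(row)):
            row[i] -= lr * delta_h[j] * x[i]
        b_h[j] -= lr * delta_h[j]
    return error

rng = random.Random(0)
n_in, n_hidden, n_out = 3, 4, 1       # hypothetical layer sizes
w_h = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
b_h = [0.0] * n_hidden
w_o = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_out)]
b_o = [0.0] * n_out

x, target = [0.1, 0.5, 0.9], [0.8]    # hypothetical normalized readings and target
errors = [train_step(x, target, w_h, b_h, w_o, b_o, lr=0.5) for _ in range(200)]
_, prediction = forward(x, w_h, b_h, w_o, b_o)   # classification phase: forward only
```

Every update touches every weight, so the cost per step grows with both network size and the number of training instances, which is the inefficiency the paragraph above refers to.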
Objective
Build an efficient parallel computing framework for traffic flow simulation on large-scale networks. For a large-scale city, we can divide the transportation network into several modules based on location and function. The simulation of all targets (cars, pedestrians, buses, bicycles) inside one module will be processed independently of the other modules. In addition, “various types of simulation events are mapped to independent logical processes that can concurrently execute their procedures while maintaining good load balance”\cite{qu2017large}. The connections between modules will be managed using time-sequence and location information. \
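One possible reading of this modular design is sketched below: a one-dimensional road is split into modules by location, each module advances its own targets independently within a time step, and targets are handed to a neighboring module at the end of each step. Everything here is an assumption made for illustration, including the module length, the time step, the constant-speed motion model, and the function names. Threads stand in for the parallel workers only to keep the sketch self-contained; a process pool or message-passing layer would be the natural substitute for real parallelism.

```python
from concurrent.futures import ThreadPoolExecutor

MODULE_LENGTH = 100.0   # hypothetical module size in metres
DT = 1.0                # hypothetical time step in seconds

def module_index(position, n_modules):
    """Map a position to the module that owns it (last module is open-ended)."""
    return min(int(position // MODULE_LENGTH), n_modules - 1)

def step_module(targets):
    """Advance every target inside one module by one time step."""
    return [(tid, pos + speed * DT, speed) for tid, pos, speed in targets]

def assign(targets, n_modules):
    """Group targets into modules by their current location."""
    modules = [[] for _ in range(n_modules)]
    for target in targets:
        modules[module_index(target[1], n_modules)].append(target)
    return modules

def simulate(targets, n_modules, n_steps):
    modules = assign(targets, n_modules)
    with ThreadPoolExecutor(max_workers=n_modules) as pool:
        for _ in range(n_steps):
            # Modules advance concurrently within the same time step.
            stepped = list(pool.map(step_module, modules))
            # Synchronization point: hand targets over based on new location.
            modules = assign([t for module in stepped for t in module], n_modules)
    return modules

# (id, position in metres, speed in metres per second), all hypothetical
targets = [("car-1", 10.0, 30.0), ("bus-1", 95.0, 10.0), ("bike-1", 250.0, 5.0)]
modules = simulate(targets, n_modules=3, n_steps=3)
```

In this sketch the end of each time step acts as a barrier, which is one simple way to keep the modules consistent in time while they exchange targets by location.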
Build an efficient parallel computing method to accelerate training of the deep neural network that predicts future traffic flow for real-time control. We may build a multi-process training framework for PyTorch or TensorFlow neural networks.
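
One common way to parallelize training is synchronous data parallelism: each worker computes gradients on its own shard of the data, the gradients are averaged, and every worker applies the same update. The sketch below shows the idea on a tiny linear model in plain Python rather than in PyTorch or TensorFlow. The model, the synthetic data, the hyperparameters, and the use of threads as stand-ins for training processes are all assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

def shard_gradient(args):
    """Gradient of the mean squared error of y = w * x + b on one data shard."""
    w, b, shard = args
    n = len(shard)
    grad_w = sum(2 * (w * x + b - y) * x for x, y in shard) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in shard) / n
    return grad_w, grad_b

def train(data, n_workers, epochs, lr):
    w, b = 0.0, 0.0
    # Equal-sized shards, so the mean of shard gradients equals the full gradient.
    shards = [data[i::n_workers] for i in range(n_workers)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        for _ in range(epochs):
            # Each worker computes a gradient on its own shard.
            grads = list(pool.map(shard_gradient, [(w, b, s) for s in shards]))
            # Average the gradients and apply one shared update.
            w -= lr * sum(g[0] for g in grads) / n_workers
            b -= lr * sum(g[1] for g in grads) / n_workers
    return w, b

# Synthetic data drawn from y = 2x + 1 (hypothetical, noise-free)
data = [(x / 10.0, 2.0 * (x / 10.0) + 1.0) for x in range(8)]
w, b = train(data, n_workers=4, epochs=2000, lr=0.1)
```

With equal shard sizes the averaged update matches single-worker full-batch gradient descent, so the parallel version changes how long each step takes, not what the model converges to.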