How to plot data chronologically

I am using matplotlib to graph my results from a .dat file.

The data is as follows

1145, 2021-07-17 00:00:00, bob, rome, 12.75, 65.0, 162.75
1146, 2021-07-12 00:00:00, billy larkin, italy, 93.75, 325.0, 1043.75
114, 2021-07-28 00:00:00, beatrice, rome, 1, 10, 100
29, 2021-07-25 00:00:00, Colin, italy the third, 10, 10, 50
5, 2021-07-22 00:00:00, Veronica, canada, 10, 100, 1000
1149, 1234-12-13 00:00:00, Billy Larkin, 1123, 12.75, 65.0, 162.75

I want to plot a year's worth of data (Jan to Dec) in chronological order, with the x-axis labels showing the month names instead of the full date.

Here is my code:

import matplotlib.pyplot as plt
import csv

x = []
y = []

with open('Claims.dat','r') as csvfile:
    #bar = csv.reader(csvfile, delimiter=",")
    plot = csv.reader(csvfile, delimiter=",")

    for row in plot:
        x.append(str(row[1]))
        y.append(str(row[6]))

plt.plot(x,y, label="Travel Claim Totals!", color="red", marker="o")
plt.xlabel('Months', color="red", size="large")

plt.ylabel('Totals', color="red", size="large")
plt.title('Claims Data:   Team Bobbyn Second Place is the First Looser', color="Blue", weight="bold", size="large")

plt.xticks(rotation=45, horizontalalignment="right", size="small")
plt.yticks(weight="bold", size="small", rotation=45)

plt.legend()
plt.subplots_adjust(left=0.2, bottom=0.40, right=0.94, top=0.90, wspace=0.2, hspace=0)
plt.show()

I think the easiest way is to re-sort the data by date, which can be parsed using the datetime module. Here is a minimal working example based on your data:

import datetime

def isfloat(value: str):
  try:
    float(value)
    return True
  except ValueError:
    return False

def isdatetime(value: str):
  try:
    datetime.datetime.fromisoformat(value)
    return True
  except ValueError:
    return False

data = r"""1145, 2021-07-17 00:00:00, bob, rome, 12.75, 65.0, 162.75
1146, 2021-07-12 00:00:00, billy larkin, italy, 93.75, 325.0, 1043.75
114, 2021-07-28 00:00:00, beatrice, rome, 1, 10, 100
29, 2021-07-25 00:00:00, Colin, italy the third, 10, 10, 50
5, 2021-07-22 00:00:00, Veronica, canada, 10, 100, 1000
1149, 1234-12-13 00:00:00, Billy Larkin, 1123, 12.75, 65.0, 162.75"""

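data = data.splitlines()  # split the raw text into one string per row

for idx in range(len(data)):
  data[idx] = data[idx].split(', ')  # split each row into its fields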
  for jdx in range(len(data[idx])):
    if data[idx][jdx].isnumeric():    # Is it an integer?
      value = int(data[idx][jdx])
    elif isfloat(data[idx][jdx]):     # Is it a float?
      value = float(data[idx][jdx])
    elif isdatetime(data[idx][jdx]):  # Is it a date?
      value = datetime.datetime.fromisoformat(data[idx][jdx])
    else:
      value = data[idx][jdx]
    data[idx][jdx] = value

data.sort(key=lambda x: x[1])

You can also sort on a single component of the date, such as the month:

data.sort(key=lambda x: x[1].month)
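
To illustrate how the choice of key changes the result, here is a small sketch using three of the dates from the sample data. The variable names are only for illustration. Python's sort is stable, so sorting by month alone leaves the days within a month in their original order; a tuple key such as (month, day) avoids that.

```python
import datetime

# Three dates from the sample data, in their original (unsorted) order
dates = [
    datetime.datetime(2021, 7, 17),
    datetime.datetime(2021, 7, 12),
    datetime.datetime(1234, 12, 13),
]

# Full chronological order: year, then month, then day
by_date = sorted(dates)

# Month only: the two July rows keep their original relative order
by_month = sorted(dates, key=lambda d: d.month)

# Month, then day: ignores the year but orders the days within each month
by_month_day = sorted(dates, key=lambda d: (d.month, d.day))

for name, result in [("by_date", by_date), ("by_month", by_month), ("by_month_day", by_month_day)]:
    print(name, [d.strftime("%Y-%m-%d") for d in result])
```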

Note: You might not need all the logic in the for-loop. The csv module handles the splitting for you, but it returns every field as a string, so the date and number columns still have to be converted before sorting or plotting.
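
As a rough sketch of the same idea using the csv module, the following reads the sample rows, converts only the two columns that are plotted, and sorts them by date. Here io.StringIO stands in for open('Claims.dat'), and the names raw, rows and labels are made up for the example. It assumes every row has a valid ISO date in column 1 and a number in column 6.

```python
import csv
import datetime
import io

# Sample rows from the question; io.StringIO stands in for open('Claims.dat')
raw = """1145, 2021-07-17 00:00:00, bob, rome, 12.75, 65.0, 162.75
1146, 2021-07-12 00:00:00, billy larkin, italy, 93.75, 325.0, 1043.75
114, 2021-07-28 00:00:00, beatrice, rome, 1, 10, 100
29, 2021-07-25 00:00:00, Colin, italy the third, 10, 10, 50
5, 2021-07-22 00:00:00, Veronica, canada, 10, 100, 1000
1149, 1234-12-13 00:00:00, Billy Larkin, 1123, 12.75, 65.0, 162.75
"""

rows = []
# skipinitialspace=True drops the blank that follows each comma
for row in csv.reader(io.StringIO(raw), skipinitialspace=True):
    when = datetime.datetime.fromisoformat(row[1])  # column 1: claim date
    total = float(row[6])                           # column 6: claim total
    rows.append((when, total))

rows.sort(key=lambda r: r[0])  # chronological order

x = [when for when, _ in rows]
y = [total for _, total in rows]
labels = [when.strftime("%b") for when in x]  # abbreviated month names

print(labels)
print(y)
```

Passing x and y (rather than strings) to plt.plot lets matplotlib treat the x-axis as dates and the y-axis as numbers.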

  • The easiest solution is to use pandas
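  • In the sample data, '1234-12-13' was changed to '2020-12-13', since the year 1234 falls outside the range that pandas’ default nanosecond-resolution timestamps can represent (roughly 1677 to 2262).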
  • If you aren’t allowed to use pandas, then please see How to read, format, sort, and save a csv file, without pandas
  • Using pandas 1.3.0 and matplotlib 3.4.2

Imports and DataFrame

import pandas as pd
import matplotlib.dates as mdates  # used to format the x-axis
import matplotlib.pyplot as plt

# read in the data
df = pd.read_csv('Claims.dat', header=None)

# convert the column to dates so the x-axis is treated as a time axis rather than as text labels; values that cannot be parsed become NaT
df[1] = pd.to_datetime(df[1], errors="coerce").dt.date

# display(df)
      0           1              2                 3      4      5        6
0  1145  2021-07-17            bob              rome  12.75   65.0   162.75
1  1146  2021-07-12   billy larkin             italy  93.75  325.0  1043.75
2   114  2021-07-28       beatrice              rome   1.00   10.0   100.00
3    29  2021-07-25          Colin   italy the third  10.00   10.0    50.00
4     5  2021-07-22       Veronica            canada  10.00  100.0  1000.00
5  1149  2020-12-13   Billy Larkin              1123  12.75   65.0   162.75

Plotting the DataFrame

# sort by date so the line connects the points chronologically, then plot (pandas uses matplotlib as the backend)
ax = df.sort_values(1).plot(x=1, y=6, marker=".", color="r", figsize=(10, 7), label="Totals")

# format title and labels
ax.set_xlabel('Months', color="red", size="large")
ax.set_ylabel('Totals', color="red", size="large")
ax.set_title('Claims Data:   Team Bobbyn Second Place is the First Looser', color="Blue", weight="bold", size="large")

# format ticks
xt = plt.xticks(rotation=45, horizontalalignment="right", size="small")
yt = plt.yticks(weight="bold", size="small", rotation=45)

# format the dates on the xaxis
myFmt = mdates.DateFormatter('%b')
ax.xaxis.set_major_formatter(myFmt)
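
If the goal is an axis that runs from Jan to Dec regardless of where the data falls, one possible approach is to combine DateFormatter with a MonthLocator and fixed axis limits. The sketch below uses matplotlib only, with hard-coded (date, total) pairs taken from the sample data. The year variable, the output file name and the Agg backend are assumptions made so the example runs on its own; drop the matplotlib.use line and call plt.show() for interactive use.

```python
import datetime

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.dates as mdates
import matplotlib.pyplot as plt

# (date, total) pairs from the sample data, with 1234 replaced by 2020
records = [
    (datetime.date(2021, 7, 17), 162.75),
    (datetime.date(2021, 7, 12), 1043.75),
    (datetime.date(2021, 7, 28), 100.0),
    (datetime.date(2021, 7, 25), 50.0),
    (datetime.date(2021, 7, 22), 1000.0),
    (datetime.date(2020, 12, 13), 162.75),
]

year = 2021  # hypothetical target year

# Keep only the rows in the target year and sort them chronologically
selected = sorted(r for r in records if r[0].year == year)
dates = [d for d, _ in selected]
totals = [t for _, t in selected]

fig, ax = plt.subplots(figsize=(10, 7))
ax.plot(dates, totals, marker="o", color="red", label="Totals")

# One tick per month, labelled with the abbreviated month name
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%b"))

# Show the whole calendar year, even the months without data
ax.set_xlim(datetime.date(year, 1, 1), datetime.date(year, 12, 31))

ax.set_xlabel("Months")
ax.set_ylabel("Totals")
ax.legend()
fig.savefig("claims_by_month.png")  # hypothetical output file name
```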