
🐍 Stock Analysis

Introduction

In this document I want to show you how to aggregate up-to-date stock data. For that we will use the online service Alpha Vantage, which offers a free tier with well documented REST APIs.

First we will aggregate the relevant data from Alpha Vantage. Second we will create data visualizations such as line charts and histograms.

Stock Symbol

First things first. Before we aggregate data from Alpha Vantage, we create a table (as a comma-separated file). This table contains the relevant symbols used by the Alpha Vantage API.

Table 1: stocks_tbl
| symbol   | name                            |
|----------|---------------------------------|
| 22UA.FRK | BioNTech SE                     |
| AIR.DEX  | Airbus SE                       |
| BAS.DEX  | BASF SE                         |
| BAYN.DEX | Bayer AG                        |
| BYW.DEX  | BayWa Aktiengesellschaft        |
| BMW.DEX  | Bayerische Motoren Werke AG     |
| CON.DEX  | Continental AG                  |
| DAI.DEX  | Daimler AG                      |
| DTE.DEX  | Deutsche Telekom AG             |
| EKT.DEX  | Energiekontor AG                |
| ENR.DEX  | Siemens Energy AG               |
| EOAN.DEX | E.ON SE                         |
| EVK.DEX  | Evonik Industries AG            |
| GTQ1.FRK | Siemens Gamesa Renewable Energy |
| IFX.DEX  | Infineon Technologies AG        |
| NDX1.DEX | Nordex SE                       |
| RWE.DEX  | RWE AG                          |
| SAP.DEX  | SAP SE                          |
| SIE.DEX  | Siemens AG                      |
| UN01.DEX | Uniper SE                       |
| VAR1.DEX | Varta AG                        |
# -*- mode: python; coding: utf-8; -*-
"""Write the stocks table to a csv file."""
import csv
from pathlib import Path

# `stocks_tbl` is Table 1 above, handed in by Org Babel
# (e.g. via a :var header argument) as a list of rows, header row first.
wd = Path.home().joinpath('Documents/journal/data/stocks')

with wd.joinpath('stocks_tbl.csv').open('w', encoding='utf-8') as fp:
    writer = csv.writer(fp, delimiter=',', quoting=csv.QUOTE_MINIMAL)
    writer.writerow(stocks_tbl[0])  # header row
    for row in stocks_tbl[1:]:      # data rows
        writer.writerow(row)
None
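
The block above expects stocks_tbl to be bound already; in the Org document it is handed in from Table 1 via a :var header argument. To run the block on its own, a minimal stand-in for the variable (first rows only, purely illustrative) could look like this:

stocks_tbl = [
    ['symbol', 'name'],          # header row
    ['22UA.FRK', 'BioNTech SE'],
    ['AIR.DEX', 'Airbus SE'],
    ['BAS.DEX', 'BASF SE'],
]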

Alpha Vantage API

Alpha Vantage provides enterprise-grade financial market data through a set of powerful and developer-friendly APIs. From traditional asset classes (e.g., stocks and ETFs) to forex and cryptocurrencies, from fundamental data to technical indicators, Alpha Vantage is your one-stop-shop for global market data delivered through cloud-based APIs, Excel, and Google Sheets. ---https://www.alphavantage.co/#about

The following Python script automatically downloads time series data as json files for every stock in our stocks_tbl.csv file and for every supported period (daily, weekly, and monthly).

Before we discuss the script in detail, here is the complete script so you can play around with it. You can also download it from this location.

#!/usr/bin/env python3
# -*- coding: utf-8; mode: python -*-
"""Get stocks data from alphavantage API.

Introduction
============

Download an arbitrary number of time series from the Alpha Vantage API while
respecting its API restrictions: with a free account you can make at most 5
API requests per 60 seconds. For more information about Alpha Vantage, please
check the `Alpha Vantage Website`_:

.. _Alpha Vantage Website: https://www.alphavantage.co/documentation/

Working Directory
=================

The user of this script can define the working directory. This directory is
used to save the downloaded json files and to read the required csv file.

CSV File
========

The user of this script can define the location of a csv file. This csv file
should provide a column with stock symbols as used by Alpha Vantage. For
instance:

#+begin_example
| symbol  | name   |
|---------|--------|
| AIR.DEX | Airbus |
#+end_example

If a location is not provided by the user, the location will be set by the
script. The csv file should always be relative to the working directory. The
default name for the csv file is `portfolio.csv`.

:Author: Marcus Kammer
:E-mail: marcus.kammer@mailbox.org
:Date: 2021-11-10

"""

# Copyright (C) 2022 Marcus Kammer

# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.

# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.

# You should have received a copy of the GNU General Public License
# along with this program.  If not, see https://www.gnu.org/licenses/.

import argparse
import csv
import json
import logging
import os
import sys
import time
from urllib import request, error
from pathlib import Path


wd = Path().home().joinpath('Documents/org/data/stocks')

if os.getenv('STOCK_APP_WD', False):
    wd = Path(os.getenv('STOCK_APP_WD')).expanduser().resolve()


class TimeSeries:
    """Represents the time series api end point.

    Constructs the url and filename for I/O operations. The output size for
    daily time series data is 'compact'.

    """
    _base_url = 'https://www.alphavantage.co/query?'
    _apikey = os.getenv('ALPHAVANTAGE')

    def __init__(self, stock, periode, adjusted=False, dtype='json'):
        self.dtype = dtype
        self.params = {'apikey': __class__._apikey,
                       'function': f'TIME_SERIES_{periode.upper()}',
                       'symbol': stock.upper(),
                       'datatype': dtype}

        if adjusted:
            self.params['function'] = self.params['function'] + '_ADJUSTED'

        if periode == 'daily':
            self.params['outputsize'] = 'compact'

    def __repr__(self):
        return self.url

    @property
    def url(self):
        url = [f'&{key}={value}' for key, value in self.params.items()]
        return __class__._base_url + ''.join(url)

    @property
    def filename(self):
        return (f"{self.params['function']}"
                f"-{self.params['symbol']}.{self.dtype}")


def request_api(url: str) -> dict:
    """Download data from `url`. And return data as json."""
    logging.info(f'Download {url}')
    try:
        r = request.urlopen(url)
    except error.HTTPError as http_error:
        if http_error.code == 401:
            sys.exit("Access denied. Check your API key.")
        elif http_error.code == 404:
            sys.exit("Can't find time series data.")
        else:
            sys.exit(f"Something went wrong... ({http_error.code})")
    else:
        data = r.read()

    try:
        j = json.loads(data)
    except json.JSONDecodeError:
        sys.exit("Couldn't read the server response.")
    else:
        return j


def eval_json(keys: list, _json: dict) -> dict:
    """Check if `_json` contains an error key from the API service.

    If it does, exit the program with that message. If not, return `_json`.

    """
    for key in keys:
        if key in _json:
            sys.exit(_json[key])
    return _json


def write_json(filename: str, _json: dict) -> None:
    """Write `_json` to `filename` and log a message."""
    path = wd.joinpath(filename)
    try:
        with path.open('w') as fp:
            json.dump(_json, fp)
    except (FileNotFoundError, TypeError) as err:
        logging.exception(err)
        sys.exit(f'Could not write {path}')
    else:
        logging.info(f'{path} was written')


def service_call(symbols: list, periode: str) -> None:
    """Download and save time series data for every chunk of `symbols`.

    Pauses 60 seconds after each chunk because of the API rate limit.

    """
    for chunk in symbols:
        for symbol in chunk:
            ts = TimeSeries(symbol, periode)
            response = eval_json(['Error Message', 'Note'], request_api(ts.url))
            write_json(ts.filename, response)
        print('\nYou have to wait 60 seconds, because of API restrictions.\n')
        time.sleep(60)


def chunks(_list: list, nth: int) -> list:
    """Split a bigger list into smaller lists.

    >>> chunks([1,2,3,4,5,6,7,8,9], 3)
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

    >>> chunks([1,2,3,4,5,6,7,8,9,10], 3)
    [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10]]

    """
    return [_list[i:i + nth] for i in range(0, len(_list), nth)]


def read_csv(filename='portfolio.csv') -> list:
    """Read `filename` as a csv file and return a list of rows."""
    try:
        with wd.joinpath(filename).open('r') as fp:
            logging.info(f'Csv File: {fp.name}')
            return [row for row in csv.reader(fp, delimiter=',')]
    except FileNotFoundError:
        sys.exit(f'Could not find {wd.joinpath(filename)}\n'
                 f'csv file should be placed relative to {wd}')


def read_args():
    """Users are able to input custom arguments.

    -wd, --working-directory: directory in which the script should operate
    -p, --portfolio: name of the csv file with stock symbols

    """
    parser = argparse.ArgumentParser(
        description='Download arbitrary time series data.'
    )

    parser.add_argument(
        '-wd',
        '--working-directory',
        type=str, help='working directory in which the script should operate.'
    )

    parser.add_argument(
        '-p',
        '--portfolio',
        type=str, help='name of the csv file with stock symbols.'
    )

    return parser.parse_args()


def main():
    user_args = read_args()
    if user_args.working_directory:
        global wd
        wd = Path(user_args.working_directory).expanduser().resolve()

    wd.mkdir(parents=True, exist_ok=True)
    logging.info(f'Working Directory: {wd}')

    if user_args.portfolio:
        sym_col = read_csv(user_args.portfolio)
    else:
        sym_col = read_csv()

    symbols = chunks([col[0] for col in sym_col[1:]], 5)
    start = time.time()

    periods = ['daily', 'weekly', 'monthly']
    for period in periods:
        service_call(symbols, period)

    sys.exit(f'It took {(time.time() - start) / 60:.1f} '
             'minutes to download json files.\n'
             f'working directory: {wd}')


if __name__ == '__main__':
    logconf = {'format': '%(asctime)s %(message)s',
               'level': logging.DEBUG}
    logging.basicConfig(**logconf)
    main()
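
As a quick sanity check you can construct a request without calling the API at all. This is just a sketch; it assumes you saved the script as a module named get_stocks.py (the name is arbitrary) and that your API key is exported in the ALPHAVANTAGE environment variable:

from get_stocks import TimeSeries  # hypothetical module name

ts = TimeSeries('AIR.DEX', 'monthly')
print(ts.url)       # full query url, including the api key
print(ts.filename)  # TIME_SERIES_MONTHLY-AIR.DEX.json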

After running this script, you should see files with names like the following in the script's working directory:

/data/stocks/TIME_SERIES_DAILY-22UA.FRK.json
/data/stocks/TIME_SERIES_WEEKLY-22UA.FRK.json
/data/stocks/TIME_SERIES_MONTHLY-22UA.FRK.json

In the next chapters we will work with the TIME_SERIES_MONTHLY-* json files.
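
To verify that the download was complete, you can count the files per period. This is only a small helper sketch; it assumes the same working directory as in the exploration blocks below:

from pathlib import Path

wd = Path.home().joinpath('Documents/journal/data/stocks')
for period in ('DAILY', 'WEEKLY', 'MONTHLY'):
    n = len(list(wd.glob(f'TIME_SERIES_{period}-*.json')))
    print(period, n)  # should match the number of symbols in stocks_tbl.csv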

Dashboard / Data Visualization

Explore the json files / data sets

Introduction to json

Most REST APIs provide json to work with. The internal representation of json in Python is a mix of dict and list. Usually a json file starts with a JavaScript object, which is represented by a Python dict.
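
As a small illustration (toy data, not from the API), parsing a json string shows how objects become dicts and arrays become lists:

import json

doc = json.loads('{"Meta Data": {"2. Symbol": "AIR.DEX"}, "prices": [22.9, 25.4]}')
print(type(doc))               # <class 'dict'>
print(type(doc['Meta Data']))  # <class 'dict'>
print(type(doc['prices']))     # <class 'list'>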

The dict class provides the .items() method, which lets us iterate over the items of a dict as (key, value) tuples that can then be sorted.

help(dict.items)

The dict class also provides a .keys() method, which returns the keys of a dict.

help(dict.keys)
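
A tiny example (toy data) of both methods, including sorting the items by key:

prices = {'2021-12-30': 22.49, '2021-11-30': 23.44, '2022-01-21': 19.13}
print(list(prices.keys()))     # the dates
print(sorted(prices.items()))  # (date, price) tuples, oldest first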

Get familiar with a stock data json file

Pick a json file and let's have a look at what keys it provides:

import json
from pathlib import Path

fp = Path.home().joinpath('Documents/journal/data/stocks/TIME_SERIES_MONTHLY-AIR.DEX.json')
return json.load(fp.open('r')).keys()

Next, inspect the 'Meta Data' entry:

import json
from pathlib import Path

key = 'Meta Data'
fp = Path.home().joinpath('Documents/journal/data/stocks/TIME_SERIES_MONTHLY-AIR.DEX.json')
return list(json.load(fp.open('r'))[key].items())

Finally, look at the first three entries of the 'Monthly Time Series':

import json
from pathlib import Path

key = 'Monthly Time Series'
fp = Path.home().joinpath('Documents/journal/data/stocks/TIME_SERIES_MONTHLY-AIR.DEX.json')
return list(json.load(fp.open('r'))[key].items())[:3]

Get familiar with all json files

Let us see if all json files share the same number of entries for the key 'Monthly Time Series':

import json
from pathlib import Path

key = 'Monthly Time Series'
counter = {}
wd = Path.home().joinpath('Documents/journal/data/stocks') # working directory
for fp in wd.glob('TIME_SERIES_MONTHLY-*.json'):
    counter[fp.name] = len(json.load(fp.open())[key])
return sorted([(v, k) for k, v in counter.items()], reverse=True)

The stock with the smallest number of entries is ENR.DEX, but most of the stocks share the same number of entries: 204.

Extract time series data

The next step is to read a file and extract the symbol and the 'Monthly Time Series' data. From each monthly entry we will extract the '4. close' value.

import json
from pathlib import Path

wd = Path.home().joinpath('Documents/journal/data/stocks')
fp = wd.joinpath('TIME_SERIES_MONTHLY-ENR.DEX.json')
stock = json.load(fp.open('r'))
symbol = stock['Meta Data']['2. Symbol']
time_series = [(el[0], el[1]['4. close'])
               for el in stock['Monthly Time Series'].items()]
return (symbol, time_series)
('ENR.DEX',
 [('2022-01-21', '19.1250'),
  ('2021-12-30', '22.4900'),
  ('2021-11-30', '23.4400'),
  ('2021-10-29', '24.8200'),
  ('2021-09-30', '23.2300'),
  ('2021-08-31', '24.5800'),
  ('2021-07-30', '22.9400'),
  ('2021-06-30', '25.4200'),
  ('2021-05-31', '26.0000'),
  ('2021-04-30', '27.8000'),
  ('2021-03-31', '30.6100'),
  ('2021-02-26', '31.2500'),
  ('2021-01-29', '30.5800'),
  ('2020-12-30', '30.0000'),
  ('2020-11-30', '24.9000'),
  ('2020-10-30', '18.8000')])
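
Note that the dates and close prices come back as strings and in reverse chronological order. As a sketch of the next step towards the line charts mentioned in the introduction (assuming matplotlib is installed; it is not used elsewhere in this document), you could convert and plot the series like this:

import json
from pathlib import Path
import matplotlib.pyplot as plt  # assumption: matplotlib is available

wd = Path.home().joinpath('Documents/journal/data/stocks')
stock = json.load(wd.joinpath('TIME_SERIES_MONTHLY-ENR.DEX.json').open('r'))
symbol = stock['Meta Data']['2. Symbol']

# oldest to newest, close prices as floats
series = sorted((date, float(values['4. close']))
                for date, values in stock['Monthly Time Series'].items())
dates = [date for date, _ in series]
closes = [close for _, close in series]

plt.plot(dates, closes)
plt.title(symbol)
plt.xticks(rotation=90)
plt.tight_layout()
plt.show()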

Author: Marcus Kammer

Email: marcus.kammer@mailbox.org

Date: (2022-01-15)

Emacs 29.1.90 (Org mode 9.6.11)

License: CC BY-SA 3.0