🐍 Stock Analysis
Introduction
In this document I want to show you how to aggregate up-to-date stock data. To do so, we will use the online service Alpha Vantage, which offers a free plan with well-documented REST services.
First we will aggregate the relevant data from Alpha Vantage. Second we will create data visualizations: line charts and histograms.
Stock Symbol
First things first. Before we aggregate data from Alpha Vantage, we create a table (as a comma-separated file). This table contains the relevant symbols used by the Alpha Vantage API.
symbol | name |
---|---|
22UA.FRK | BioNTech SE |
AIR.DEX | Airbus SE |
BAS.DEX | BASF SE |
BAYN.DEX | Bayer AG |
BYW.DEX | BayWa Aktiengesellschaft |
BMW.DEX | Bayerische Motoren Werke AG |
CON.DEX | Continental AG |
DAI.DEX | Daimler AG |
DTE.DEX | Deutsche Telekom AG |
EKT.DEX | Energiekontor AG |
ENR.DEX | Siemens Energy AG |
EOAN.DEX | E.ON SE |
EVK.DEX | Evonik Industries AG |
GTQ1.FRK | Siemens Gamesa Renewable Energy |
IFX.DEX | Infineon Technologies AG |
NDX1.DEX | Nordex SE |
RWE.DEX | RWE AG |
SAP.DEX | SAP SE |
SIE.DEX | Siemens AG |
UN01.DEX | Uniper SE |
VAR1.DEX | Varta AG |
# -*- mode: python; coding: utf-8; -*-
"""Write the stocks table to a csv file."""
import csv
from pathlib import Path

# `stocks_tbl` is the table above, passed in as a list of rows
# (header row first), e.g. via an Org Babel table variable.
wd = Path.home().joinpath('Documents/journal/data/stocks')
with wd.joinpath('stocks_tbl.csv').open('w', encoding='utf-8') as fp:
    writer = csv.writer(fp, delimiter=',', quoting=csv.QUOTE_MINIMAL)
    writer.writerow(stocks_tbl[0])
    for row in stocks_tbl[1:]:
        writer.writerow(row)
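To double-check the result, here is a minimal sketch of the same round trip, writing the table and reading it back. It uses a temporary directory and a two-row sample standing in for the real working directory and the full `stocks_tbl`:

```python
import csv
import tempfile
from pathlib import Path

# Sample table mirroring the first rows above (header first).
stocks_tbl = [
    ['symbol', 'name'],
    ['AIR.DEX', 'Airbus SE'],
    ['BAS.DEX', 'BASF SE'],
]

wd = Path(tempfile.mkdtemp())  # stand-in for the real working directory
csv_path = wd.joinpath('stocks_tbl.csv')

# Write the rows, then read them back with csv.reader.
with csv_path.open('w', encoding='utf-8', newline='') as fp:
    writer = csv.writer(fp, delimiter=',', quoting=csv.QUOTE_MINIMAL)
    writer.writerows(stocks_tbl)

with csv_path.open('r', encoding='utf-8', newline='') as fp:
    rows = list(csv.reader(fp, delimiter=','))
```

Reading the file back should reproduce the rows exactly, header included.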
Alpha Vantage API
Alpha Vantage provides enterprise-grade financial market data through a set of powerful and developer-friendly APIs. From traditional asset classes (e.g., stocks and ETFs) to forex and cryptocurrencies, from fundamental data to technical indicators, Alpha Vantage is your one-stop-shop for global market data delivered through cloud-based APIs, Excel, and Google Sheets. ---https://www.alphavantage.co/#about
The following Python script automatically downloads time series data as json files, for every stock in our stocks_tbl.csv file and for every available period.
Before we discuss this script in detail, I will provide the complete script so you can play around with it. You can also download it from this location.
#!/usr/bin/env python3
# -*- coding: utf-8; mode: python -*-
"""Get stocks data from the Alpha Vantage API.

Introduction
============

Download an arbitrary number of time series data sets from the Alpha
Vantage API, with respect to API restrictions.  These restrictions
mean that with a free account you can make at most 5 API calls per
60 seconds.

For more information related to Alpha Vantage, please check the
`Alpha Vantage Website`_:

.. _Alpha Vantage Website: https://www.alphavantage.co/documentation/

Working Directory
=================

The user of this script is able to define the working directory.  This
directory is used to save the downloaded json files and to read the
needed csv file.

CSV File
========

The user of this script is able to define the location of a csv file.
This csv file should provide a column with symbols of stocks which are
used by Alpha Vantage.  For instance:

#+begin_example
| symbol  | name   |
|---------|--------|
| AIR.DEX | Airbus |
#+end_example

If a location is not provided by the user, the location will be set by
the script.  The csv file should always be relative to the working
directory.  The default name for the csv file is `portfolio.csv`.

:Author: Marcus Kammer
:E-mail: marcus.kammer@mailbox.org
:Date: 2021-11-10
"""
# Copyright (C) 2022 Marcus Kammer

# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.

# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.

# You should have received a copy of the GNU General Public License
# along with this program.  If not, see https://www.gnu.org/licenses/.
import argparse
import csv
import json
import logging
import os
import sys
import time
from urllib import request, error
from pathlib import Path

wd = Path().home().joinpath('Documents/org/data/stocks')
if os.getenv('STOCK_APP_WD', False):
    wd = Path(os.getenv('STOCK_APP_WD')).expanduser().resolve()


class TimeSeries:
    """Represent the time series API end point.

    Constructs url and filename for I/O operations.  The output size
    for daily time series data is 'compact'.
    """

    _base_url = 'https://www.alphavantage.co/query?'
    _apikey = os.getenv('ALPHAVANTAGE')

    def __init__(self, stock, period, adjusted=False, dtype='json'):
        self.dtype = dtype
        self.params = {'apikey': __class__._apikey,
                       'function': f'TIME_SERIES_{period.upper()}',
                       'symbol': stock.upper(),
                       'datatype': dtype}
        if adjusted:
            self.params['function'] = self.params['function'] + '_ADJUSTED'
        if period == 'daily':
            self.params['outputsize'] = 'compact'

    def __repr__(self):
        return self.url

    @property
    def url(self):
        url = [f'&{key}={value}' for key, value in self.params.items()]
        return __class__._base_url + ''.join(url)

    @property
    def filename(self):
        return (f"{self.params['function']}"
                f"-{self.params['symbol']}.{self.dtype}")


def request_api(url: str) -> dict:
    """Download data from `url` and return it as json."""
    logging.info(f'Download {url}')
    try:
        r = request.urlopen(url)
    except error.HTTPError as http_error:
        if http_error.code == 401:
            sys.exit("Access denied. Check your API key.")
        elif http_error.code == 404:
            sys.exit("Can't find time series data.")
        else:
            sys.exit(f"Something went wrong... ({http_error.code})")
    else:
        data = r.read()
        try:
            j = json.loads(data)
        except json.JSONDecodeError:
            sys.exit("Couldn't read the server response.")
        else:
            return j


def eval_json(keys: list, _json: dict) -> dict:
    """Check if `_json` includes an error key from the API service.

    If so, exit the program with that message.  If not, return `_json`.
    """
    for key in keys:
        if key in _json:
            sys.exit(_json[key])
    return _json


def write_json(filename: str, _json: dict) -> None:
    """Write `_json` to `filename` and log a message."""
    try:
        with wd.joinpath(filename).open('w') as fp:
            json.dump(_json, fp)
    except (FileNotFoundError, TypeError) as err:
        logging.exception(err)
        sys.exit(f'Could not write {wd.joinpath(filename)}')
    else:
        logging.info(f'{wd.joinpath(filename)} was written')


def service_call(symbols: list, period: str) -> None:
    """Download time series data for every chunk of `symbols`."""
    for chunk in symbols:
        for symbol in chunk:
            ts = TimeSeries(symbol, period)
            response = eval_json(['Error Message', 'Note'],
                                 request_api(ts.url))
            write_json(ts.filename, response)
        print('\nYou have to wait 60 seconds, because of API restrictions.\n')
        time.sleep(60)


def chunks(_list: list, nth: int) -> list:
    """Split a bigger list into smaller lists.

    >>> chunks([1,2,3,4,5,6,7,8,9], 3)
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    >>> chunks([1,2,3,4,5,6,7,8,9,10], 3)
    [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10]]
    """
    return [_list[i:i + nth] for i in range(0, len(_list), nth)]


def read_csv(filename='portfolio.csv') -> list:
    """Read `filename` as a csv file and return a list of lists."""
    try:
        with wd.joinpath(filename).open('r') as fp:
            logging.info(f'Csv File: {fp.name}')
            reader = csv.reader(fp, delimiter=',')
            return [row for row in reader]
    except FileNotFoundError:
        sys.exit(f'Could not find {wd.joinpath(filename)}\n'
                 f'csv file should be placed relative to {wd}')


def read_args():
    """Read custom arguments from the user.

    -wd, --working-directory: directory the script should operate in
    -p, --portfolio: name of the csv file with stock symbols
    """
    parser = argparse.ArgumentParser(
        description='Download arbitrary time series data.'
    )
    parser.add_argument(
        '-wd', '--working-directory', type=str,
        help='working directory the script should operate in.'
    )
    parser.add_argument(
        '-p', '--portfolio', type=str,
        help='name of the csv file with stock symbols.'
    )
    return parser.parse_args()


def main():
    user_args = read_args()
    if user_args.working_directory:
        global wd
        wd = Path(user_args.working_directory).expanduser().resolve()
        wd.mkdir(parents=True, exist_ok=True)
    logging.info(f'Working Directory: {wd}')
    if user_args.portfolio:
        sym_col = read_csv(user_args.portfolio)
    else:
        sym_col = read_csv()
    symbols = chunks([col[0] for col in sym_col[1:]], 5)
    start = time.time()
    periods = ['daily', 'weekly', 'monthly']
    for period in periods:
        service_call(symbols, period)
    sys.exit(f'It took {(time.time() - start) / 60:.1f} '
             'minutes to download json files.\n'
             f'working directory: {wd}')


if __name__ == '__main__':
    logconf = {'format': '%(asctime)s %(message)s', 'level': logging.DEBUG}
    logging.basicConfig(**logconf)
    main()
After running this script, you should see file names like the following in your script's working directory:
/data/stocks/TIME_SERIES_DAILY-22UA.FRK.json
/data/stocks/TIME_SERIES_WEEKLY-22UA.FRK.json
/data/stocks/TIME_SERIES_MONTHLY-22UA.FRK.json
For the next chapters we will work with the TIME_SERIES_MONTHLY-*.json files.
Dashboard / Data Visualization
Explore the json files / data sets
Introduction to json
Most REST APIs provide json data to work with. The internal representation of json in Python is a mix of dict and list. Usually a json file starts with a JavaScript object, which is represented by a Python dict.
The dict class provides the .items() method, which allows us to iterate over the items of a dict as (key, value) tuples, which can then be sorted.
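A minimal sketch of this mapping, using a made-up json string with the same shape as the Alpha Vantage responses:

```python
import json

# The top-level object becomes a dict, nested objects become dicts,
# and arrays become lists.
doc = json.loads('{"Meta Data": {"2. Symbol": "AIR.DEX"}, "closes": [22.94, 25.42]}')

top_type = type(doc)              # dict
nested_type = type(doc['Meta Data'])  # dict
array_type = type(doc['closes'])  # list
```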
help(dict.items)
The dict class also provides a .keys() method, which returns the keys of a dict.
help(dict.keys)
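As a small illustration of both methods, assume a toy dict of monthly closing prices (made-up values), in arbitrary order:

```python
# Dates map to closing prices.
closes = {'2021-03-31': 30.61, '2021-01-29': 30.58, '2021-02-26': 31.25}

keys = list(closes.keys())       # just the date strings, in insertion order
pairs = sorted(closes.items())   # (date, close) tuples, sorted by date
```

Because the dates are ISO formatted (YYYY-MM-DD), sorting the tuples lexicographically also sorts them chronologically.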
Get familiar with a stock data json file
Pick a json file and let's have a look at what keys it provides:
import json
from pathlib import Path

fp = Path.home().joinpath('Documents/journal/data/stocks/TIME_SERIES_MONTHLY-AIR.DEX.json')
return json.load(fp.open('r')).keys()
import json
from pathlib import Path

key = 'Meta Data'
fp = Path.home().joinpath('Documents/journal/data/stocks/TIME_SERIES_MONTHLY-AIR.DEX.json')
return list(json.load(fp.open('r'))[key].items())
import json
from pathlib import Path

key = 'Monthly Time Series'
fp = Path.home().joinpath('Documents/journal/data/stocks/TIME_SERIES_MONTHLY-AIR.DEX.json')
return list(json.load(fp.open('r'))[key].items())[:3]
Get familiar with all json files
Let us see whether all json files share the same number of entries for the key 'Monthly Time Series':
import json
from pathlib import Path

key = 'Monthly Time Series'
counter = {}
wd = Path.home().joinpath('Documents/journal/data/stocks')  # working directory
for fp in wd.glob('TIME_SERIES_MONTHLY-*.json'):
    counter[fp.name] = len(json.load(fp.open())[key])
return sorted([(v, k) for k, v in counter.items()], reverse=True)
The stock with the smallest number of entries is ENR.DEX, but most of the stocks share the same number of entries: 204.
Extract time series data
The next step is to read a file and extract the symbol together with the time series data under 'Monthly Time Series'. From each entry we will extract the key '4. close'.
import json
from pathlib import Path

wd = Path.home().joinpath('Documents/journal/data/stocks')
fp = wd.joinpath('TIME_SERIES_MONTHLY-ENR.DEX.json')
stock = json.load(fp.open('r'))
symbol = stock['Meta Data']['2. Symbol']
time_series = [(el[0], el[1]['4. close'])
               for el in stock['Monthly Time Series'].items()]
return (symbol, time_series)
('ENR.DEX', [('2022-01-21', '19.1250'), ('2021-12-30', '22.4900'), ('2021-11-30', '23.4400'), ('2021-10-29', '24.8200'), ('2021-09-30', '23.2300'), ('2021-08-31', '24.5800'), ('2021-07-30', '22.9400'), ('2021-06-30', '25.4200'), ('2021-05-31', '26.0000'), ('2021-04-30', '27.8000'), ('2021-03-31', '30.6100'), ('2021-02-26', '31.2500'), ('2021-01-29', '30.5800'), ('2020-12-30', '30.0000'), ('2020-11-30', '24.9000'), ('2020-10-30', '18.8000')])
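Before charting, the (date, close) tuples need two small transformations: the API delivers the entries newest-first, and the close prices are strings. A minimal sketch, using a shortened, hard-coded sample of the series above:

```python
# Sample of the extracted (date, close) tuples, newest first.
time_series = [('2022-01-21', '19.1250'),
               ('2021-12-30', '22.4900'),
               ('2021-11-30', '23.4400')]

# Reverse for a left-to-right time axis and convert closes to floats.
dates = [date for date, _ in reversed(time_series)]
closes = [float(close) for _, close in reversed(time_series)]

# With matplotlib installed (an assumption, it is not used elsewhere in
# this document), a line chart is then e.g.:
#   import matplotlib.pyplot as plt
#   plt.plot(dates, closes)
#   plt.show()
```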