top of page
Search
  • mattostanik

How to determine probability of stock trades

Updated: Feb 15, 2019


Stock market data is a good example to illustrate some of the tools and approaches that can be used for a range of business and financial analysis projects. Stock market data is plentiful, and when used successfully, can lead to a direct practical result (making money). I have spent much of my time in the past ten years analyzing business data for marketing, sales and customer relationships, but more recently, I have found stock market data to be equally intriguing. In this example I will look at using Python and a PostgreSQL database to calculate stock probabilities.


Note: this article is an example of data analysis and is not investment advice. It also is overly simple but gives a good sense of the tools and possibilities. Many stock market probability analysis projects result in probabilities of roughly 50/50.


Acquiring market data


There are a variety of sources for public market data. Some like Yahoo Finance are free, and others like Intrinio are paid. Recently I have been working with Interactive Brokers because they offer Python support via their API which makes it easy to download real-time and historic market data and easy to place trades if desired.


To use Interactive Brokers, you need to first create an account and go through their approval process. Then you can install their TWS and IB Gateway applications on your computer. IB Gateway is a tool that simply allows your Python application to connect to their API. Technically your custom application calls the IB Gateway program, and IB Gateway manages the actual API connection for security and performance reasons.


Finishing the install


After you have installed the IB Gateway, you will also need to move files on your system. Go to the IBJts in the package you downloaded from IB. Move the ibapi folder plus the setup file to your Python site-packages folder. On a Mac the folder is located here:


/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/


Note that at the time I wrote this, the IB API is not yet compatible with Python 3.7. I discovered this the hard way by first trying to install it with Python 3.7 and running into weird errors. The solution was to install Python 3.6.7 and then install the API components with that version. The version compatibility may change in the future, but if you run into odd errors, I’d suggest looking into this further.


Finally, I recommend using the ib_insync package with Python to provide a more elegant wrapper about the IB API interactions. The ib_insync package is available here. The rest of this article uses examples assuming that you have installed ib_insync.


Retrieving historic market data

To keep this example simple, let’s focus on a single stock. I will pick Tableau because their ticker symbol is ‘DATA’. Create a table in your database to save the price data, let’s call it ‘prices’. My prices table has three columns: date, time and price. Then you can write a simple Python script that connects to IB and retrieves historic data for the DATA stock, then saves it to your table:


import psycopg2

from ib_insync import *


# connect to IB

global ib

ib = IB()

ib.connect('127.0.0.1', 4002, clientId=2)


stock = Stock('DATA','SMART','USD')

bars = ib.reqHistoricalData(stock, endDateTime='20190213 15:00:00', durationStr='30 D', barSizeSetting='2 hours', whatToShow='TRADES', useRTH=True)


print(bars[0].close)


for bar in bars:

print(str(bar.date)[0:10],str(bar.date)[11:19],str(bar.open),str(bar.close))


# connect to database

conn = psycopg2.connect("dbname=name user=username password=yourpassword")

cur = conn.cursor()


for bar in bars:

SQL = 'INSERT INTO price_test (date,time,price) VALUES (\'' + str(bar.date)[0:10] + '\',\'' + str(bar.date)[11:19] + '\',' + str(bar.open) + ');'

cur.execute(SQL)

conn.commit()


if str(bar.date)[11:19] == '14:00:00':

SQL = 'INSERT INTO price_test (date,time,price) VALUES (\'' + str(bar.date)[0:10] + '\',\'15:00:00\',' + str(bar.close) + ');'

cur.execute(SQL)

conn.commit()


Let’s talk about what this code does. First it connects to IB. In the ib.connect parameters '4002' is your paper trading (simulated) account. Then it defines the stock to retrieve data for. The key request is ib.reqHistoricalData. The parameters for this request define how much data you are doing to get over what time period:

  • endDateTime is the last bar you want to retrieve (in this example, at market close (Central time) on February 13)

  • durationStr is how many hours or days of data (in this example, the previous 30 trading days)

  • barSizeSetting is what time interval to use (in this example, 2 hours)


It then loops through the results. First it prints them to the screen so you can see them. In my example, the first 10 characters of bar.date are the date and the next characters are the time. Then bar.open is the open price and bar.close is the close price for that time interval.


I then have a second loop that writes the bar data to my database table. Because I am doing 2 hour bars, my last bar each day starts at 14:00 Central. To get the market close price, whenever my loop encounters a 14:00 bar, it takes the close price from that time and saves it as a separate price at the 15:00 time.


I am using the psycopg2 package for my PostgreSQL database connection from Python. This would need customized with your database name, username and password. There are a number of options for how to connect to your database, but I typically use psycopg2.


Writing a probability query

You can then write a SQL query to look at probabilities in the pricing data you have saved. For example, the following query tells you the likelihood that the price closes up for the day:


SELECT 100*count(1)/(SELECT count(1) FROM price_test WHERE time = '15:00') FROM price_test t1 WHERE time = '15:00' AND price > (SELECT price FROM price_test t2 WHERE t2.date=t1.date AND time = '08:30');

In this time period, the probability is 50%.


Or you can check if the price is up at 10:00am, what the is the likelihood that the price will go up further by end of the day?


SELECT 100*count(1)/(SELECT count(1) FROM price_test t1 WHERE time = '10:00' AND price > (SELECT price FROM price_test t2 WHERE t2.date=t1.date AND time = '08:30')) FROM price_test t1 WHERE time = '15:00' AND price > (SELECT price FROM price_test t2 WHERE t2.date=t1.date AND time = '10:00') AND (SELECT price FROM price_test t3 WHERE t3.date=t1.date AND time = '10:00') > (SELECT price FROM price_test t4 WHERE t4.date=t1.date AND time = '08:30');

In this time period, the answer is 43%.


Building a forward-looking projection

You can start to use these building blocks to create a forward-looking projection. For example, if you know the probability that the stock will go up by end of the day, based on what the stock has done over the X days what is the likelihood it will go up tomorrow? Or if it is up/down at a certain time of the day, what is the likelihood of what it will do by end of the day?


The details of building the projection are best saved for a future article, but hopefully this gives you a sense of where to start at and some of the tools that are available for this work. If you’d like to learn more, feel free to contact me.

91 views0 comments

Recent Posts

See All

© 2020 All Rights Reserved

Proudly created with wix.com

bottom of page