Web Scraping SourceForge

When I started writing scripts a few months ago, I quickly learned how essential programs and scripts are to gaining an informational edge in the stock market. It’s for that reason that I decided to scrape SourceForge.net for a list of the programs it hosts. As of right now, this script goes to the URL “http://sourceforge.net/directory/os%3Awindows/freshness%3Arecently-updated/?page=” + i (where i represents the page number), scrapes the contents for program titles, and saves the results to a text file. There are some bugs that need to be worked out, but for all intents and purposes this script does what it was designed to do: get a list of programs hosted by SourceForge.

Requirements: Python 2.7, BeautifulSoup 3.2.1, urllib2, codecs

Copy and paste the following code into your Python IDLE window:

from BeautifulSoup import BeautifulSoup
import urllib2
import codecs

# Create (or empty) the output file before appending to it.
sourceforgefile = open("sourceforgeindex.txt", "w+")
sourceforgefile.close()

j = 2150  # replace with any number desired

i = 1
while i < j:
    # Fetch page i of the recently updated Windows directory.
    url = urllib2.urlopen("http://sourceforge.net/directory/os%3Awindows/freshness%3Arecently-updated/?page=" + str(i))
    content = url.read()
    soup = BeautifulSoup(content)
    # Each program title sits in a <span itemprop="name"> element.
    for node in soup.findAll("span", {"itemprop": "name"}):
        result = "".join(node.findAll(text=True))
        sourceforgefile = codecs.open("sourceforgeindex.txt", "a", "utf_8")
        # One title per line; without the newline the titles run together.
        sourceforgefile.write(result + "\n")
        sourceforgefile.close()
        print unicode(result)
    i += 1
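For anyone on Python 3, where urllib2 and BeautifulSoup 3 are not available, the title-extraction step can be sketched with the standard library alone. This is only a rough sketch and not the script above: the names TitleParser and extract_titles and the sample markup are made up for illustration, and the real SourceForge page may be structured differently.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text of every <span itemprop="name"> element."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._depth = 0      # > 0 while inside a matching span
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        if self._depth:
            self._depth += 1          # nested span inside a title
        elif dict(attrs).get("itemprop") == "name":
            self._depth = 1
            self._chunks = []

    def handle_endtag(self, tag):
        if tag == "span" and self._depth:
            self._depth -= 1
            if self._depth == 0:
                self.titles.append("".join(self._chunks).strip())

    def handle_data(self, data):
        if self._depth:
            self._chunks.append(data)


def extract_titles(html):
    """Return the list of program titles found in an HTML string."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return parser.titles


# Hypothetical sample markup, only to exercise the parser.
sample = ('<div><span itemprop="name">Alpha Tool</span>'
          '<span class="x">skip</span>'
          '<span itemprop="name">Beta <b>Suite</b></span></div>')
print(extract_titles(sample))
```

The page download itself would be handled separately (for example with urllib.request in Python 3), so the parsing can be tried offline on a saved page first.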

Map the Trends: ADX Method (01)

According to John Murphy’s Ten Laws of Technical Trading, the first rule of trading is to map the trends. In this post we will evaluate the Average Directional Index (ADX) method of determining whether a trend exists.

  • First a little background on the ADX:

“ADX is used to quantify trend strength. ADX calculations are based on a moving average of price range expansion over a given period of time. The default setting is 14 bars, although other time periods can be used. ADX can be used on any trading vehicle such as stocks, mutual funds, exchange-traded funds and futures. (For background reading, see Exploring Oscillators and Indicators: Average Directional Index and Discerning Movement With The Average Directional Index – ADX.)” Source: http://www.investopedia.com/articles/trading/07/adx-trend-indicator.asp
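To make the quoted description more concrete, here is a rough Python 3 sketch of the ADX calculation using Wilder's smoothing, which is the commonly published method. The function names adx and wilder_sum are my own, and charting platforms may seed their averages differently, so the values are not guaranteed to match TOS exactly.

```python
def wilder_sum(values, period):
    """Wilder's running sum: seed with a plain sum, then decay by 1/period."""
    smoothed = [sum(values[:period])]
    for v in values[period:]:
        smoothed.append(smoothed[-1] - smoothed[-1] / period + v)
    return smoothed


def adx(highs, lows, closes, period=14):
    """Return the list of ADX values (needs at least 2 * period bars)."""
    if len(highs) < 2 * period:
        raise ValueError("need at least 2 * period bars")
    tr, plus_dm, minus_dm = [], [], []
    for t in range(1, len(highs)):
        up = highs[t] - highs[t - 1]
        down = lows[t - 1] - lows[t]
        # Directional movement: only the larger, positive move counts.
        plus_dm.append(up if up > down and up > 0 else 0.0)
        minus_dm.append(down if down > up and down > 0 else 0.0)
        # True range: the bar's range, extended by any gap from the prior close.
        tr.append(max(highs[t] - lows[t],
                      abs(highs[t] - closes[t - 1]),
                      abs(lows[t] - closes[t - 1])))
    dx = []
    for s_tr, s_plus, s_minus in zip(wilder_sum(tr, period),
                                     wilder_sum(plus_dm, period),
                                     wilder_sum(minus_dm, period)):
        plus_di = 100.0 * s_plus / s_tr if s_tr else 0.0
        minus_di = 100.0 * s_minus / s_tr if s_tr else 0.0
        total = plus_di + minus_di
        dx.append(100.0 * abs(plus_di - minus_di) / total if total else 0.0)
    # ADX is a Wilder-style average of DX, seeded with a simple mean.
    result = [sum(dx[:period]) / period]
    for value in dx[period:]:
        result.append((result[-1] * (period - 1) + value) / period)
    return result
```

With the default 14 bars, the first ADX value only becomes available after 28 bars of price history.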

  • Components of ADX:
  1. Range 0-100
  2. 0-25 Absent or Weak Trend
    25-50 Strong Trend
    50-75 Very Strong Trend
    75-100 Extremely Strong Trend
  3. When Price Diverges from ADX, the trend is unconfirmed
  • Our method will be to apply the ADX indicator to identify upward trends for Buy strategies.
  1. Scan the market for ADX crossovers above 25, implying a strengthening trend. Think or Swim (TOS) has a prebuilt scan for this.
  2. Determine whether the stock is uptrending or downtrending. To accomplish this I will be using 3 simple moving averages. A post including the code for the indicator will be found in the Think or Swim section of this blog.
  3. Create a watchlist for all the uptrending stocks.
  4. Analyze the trend on different timeframes.
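The first three steps above can be roughed out in Python 3 as follows. Several things here are my assumptions and not part of the method as stated: the SMA lengths (10, 20, 50), the rule that an uptrend means the shorter averages are stacked above the longer ones, the choice to put a boundary reading such as 25 in the stronger band, and all of the function names.

```python
def trend_strength(adx_value):
    """Map an ADX reading to the labels in the components list."""
    if not 0 <= adx_value <= 100:
        raise ValueError("ADX must be between 0 and 100")
    if adx_value < 25:
        return "Absent or Weak Trend"
    if adx_value < 50:
        return "Strong Trend"
    if adx_value < 75:
        return "Very Strong Trend"
    return "Extremely Strong Trend"


def crossed_above(adx_values, level=25):
    """True when the latest reading moved from at or below level to above it."""
    return len(adx_values) >= 2 and adx_values[-2] <= level < adx_values[-1]


def sma(values, length):
    """Simple moving average of the most recent `length` values."""
    return sum(values[-length:]) / length


def is_uptrending(closes, lengths=(10, 20, 50)):
    """Assumed rule: shorter SMAs stacked above longer ones."""
    short, medium, long_ = sorted(lengths)
    if len(closes) < long_:
        return False
    return sma(closes, short) > sma(closes, medium) > sma(closes, long_)


def build_watchlist(candidates):
    """candidates maps symbol -> (adx_values, closes); returns sorted symbols."""
    return sorted(sym for sym, (adx_values, closes) in candidates.items()
                  if crossed_above(adx_values) and is_uptrending(closes))
```

A symbol lands on the watchlist only if its ADX has just crossed above 25 and its closes pass the moving average check, so a strong downtrend is filtered out even though its ADX is high.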

In the next post, “Map the Trends: ADX Method (02)”, we will apply our technique for finding uptrending stocks and create a watchlist. From there we will analyze John Murphy’s second law, “Spot the Trend and Go With It”. In this stage we will attempt to locate entry points for the stocks in our watchlist.

Technical Trading Strategies

In this section I would like to review some common technical trading strategies provided by http://www.stockcharts.com, and monitor their effectiveness at predicting the market, finding support and resistance, and determining entry and exit points. We will be using John Murphy’s Ten Laws of Technical Trading, also provided by http://www.stockcharts.com.

 

  • John Murphy’s Ten Laws of Technical Trading:

1) Map the Trends
2) Spot the Trend and Go With It
3) Find the Low and High of It
4) Know How Far to Backtrack
5) Draw the Line
6) Follow That Average
7) Learn the Turns
8) Know the Warning Signs
9) Trend or Not a Trend?
10) Know the Confirming Signs

For a complete breakdown of these rules please visit: http://stockcharts.com/school/doku.php?id=chart_school:trading_strategies:john_murphy_s_ten_laws

 

  • The indicators/strategies of interest include:

CCI Correction – A strategy that uses weekly CCI to dictate a trading bias and daily CCI to generate trading signals.

CVR3 VIX Market Timing – Developed by Larry Connors and Dave Landry, this is a strategy that uses overextended readings in the CBOE Volatility Index ($VIX) to generate buy and sell signals for the S&P 500.

Gap Trading Strategies – Various strategies for trading based on opening price gaps.

Ichimoku Cloud – A strategy that uses the Ichimoku Cloud to set the trading bias, identify corrections and signal short-term turning points.

Last Stochastic Technique – A simple trading system based on a special version of the Stochastic Oscillator.

Moving Momentum – A strategy that uses a three-step process to identify the trend, wait for corrections within that trend and then identify reversals that signal an end to the correction.

Narrow Range Day NR7 – Developed by Toby Crabel, the narrow range day strategy looks for range contractions to predict range expansions. Advanced scan code included that tweaks this strategy by adding Aroon and CCI qualifiers.

Percent Above 50-day SMA – A strategy that uses the breadth indicator, percent above the 50-day moving average, to define the tone for the broad market and identify corrections.

Pre-Holiday Effect – How the market has performed prior to major US holidays and how that can affect trading decisions.

RSI2 – An overview of Larry Connors’ mean reversion strategy using 2-period RSI.

Sector Rotation Based on Performance – Based on research from Mebane Faber, this sector rotation strategy buys the top performing sectors and re-balances once per month.

Six Month Cycle MACD – Developed by Sy Harding, this strategy combines the six month bull-bear cycle with MACD signals for timing.

Stochastic Pop and Drop – Developed by Jake Bernstein and modified by David Steckler, this strategy uses the Average Directional Index (ADX) and Stochastic Oscillator to identify price pops and breakouts.

Slope Performance Trend – Using the slope indicator to quantify the long term trend and measure relative performance for use in a trading strategy with the nine sector SPDRs.

Swing Charting – What Swing Trading is and how it can be used to profit under certain market conditions.

Trend Quantification and Asset Allocation – This article shows chartists how to define long-term trend reversals as a process by smoothing the price data with four different Percentage Price Oscillators. Chartists can also use this technique to quantify trend strength and determine asset allocation.

For a complete breakdown of these strategies please visit: http://stockcharts.com/school/doku.php?id=chart_school:trading_strategies

Purpose Statement

When trading the market, one of the most important first steps is to filter out the noise. This blog is organized primarily to suit the writer; however, readers are welcome. I will use it as a place to keep my overall thoughts organized, including research, trading ideas, data mining programs, and market scanners.

I will be using Python 2.7 alongside the HTML parser BeautifulSoup to create data mining programs that scan and filter various data. Doing so should make it easier to filter out noise and make quality investment decisions.

For building market scanners, I will be using Think or Swim’s coding platform.

The focus of my research will be geared towards stocks trading under $5.00 a share.

OTCMarkets.com Screen Scraper Using Python and Regular Expressions – Part 1

Purpose: Scan the OTC market for companies with a market cap under $1 million. Tiers = [OTCQB, OTCBB, Pink Sheet]

Notes:

This skeleton template is not complete. However, the script will scrape our desired URL for market values that are exactly 7 digits long, using a list of symbols that can be changed at any point. The regex after the | symbol finds all dates, with the intention of finding the last date the market cap was updated; the desired date is the first one found. In part two we will refine the regex to print only results for market caps <= $1,000,000 and only the first instance of the dates.

Tools: Python 2.7.5, Regular Expressions, urllib

Code:

import re
import urllib

NewSymbolsList = ["GOFF", "GOOG", "SWVI"]
# First alternative: a 7-digit market cap such as $1,234,567.
# Second alternative (after the |): a cell holding an "a/o <date>" entry.
regex = r'(<td>\$\d{1}\,\d{3}\,\d{3}</td>)|(<td>a\/o(.+?)</td>)'
pattern = re.compile(regex)

i = 0
while i < len(NewSymbolsList):
    url = "http://www.otcmarkets.com/stock/" + NewSymbolsList[i] + "/company-info"
    htmlfile = urllib.urlopen(url)
    htmltext = htmlfile.read()
    # findall returns one tuple of groups per match.
    marketcap = re.findall(pattern, htmltext)
    print str(marketcap)
    i += 1
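To see what the pattern returns and how part two's filtering might look, here is a small Python 3 sketch that runs the same regex against a made-up HTML fragment. The fragment, the date format after "a/o", and the helper name first_cap_and_date are all assumptions for illustration; the real page markup may differ.

```python
import re

# Same pattern as the script above: a 7-digit market cap OR an "a/o <date>" cell.
pattern = re.compile(r'(<td>\$\d{1}\,\d{3}\,\d{3}</td>)|(<td>a\/o(.+?)</td>)')

# Hypothetical fragment; the real page markup may differ.
sample = ('<tr><td>Market Value</td><td>$1,000,000</td>'
          '<td>a/o Jan 02, 2014</td></tr>'
          '<tr><td>Shares</td><td>a/o Dec 31, 2013</td></tr>')


def first_cap_and_date(htmltext):
    """Return (market cap as int or None, first date string or None)."""
    cap, date = None, None
    # With three groups, findall yields (cap cell, date cell, date text) tuples.
    for cap_cell, _date_cell, date_text in pattern.findall(htmltext):
        if cap_cell and cap is None:
            cap = int(re.sub(r'[^\d]', '', cap_cell))
        if date_text and date is None:
            date = date_text.strip()
    return cap, date


cap, date = first_cap_and_date(sample)
if cap is not None and cap <= 1000000:
    print(cap, date)
```

Because the pattern only matches values of the form $d,ddd,ddd, a market cap below $1,000,000 would not be picked up at all, which is one of the things the regex still needs in part two.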

 

Simple Moving Average (SMA) Scanner

As I am still familiarizing myself with the TOS syntax, I created a simple scanner that searches for stocks that may be in a reversal.

The criteria:

close >= 5 day SMA

close >= 10 day SMA

5 day SMA >= 10 day SMA

volume > 100000

close <= highest(high[10], length = 10)

high < high[10]

high < low[20]

Final Script:

close >= simplemovingavg("length" = 5) and close >= simplemovingavg("length" = 10) and simplemovingavg("length" = 5) >= simplemovingavg("length" = 10) and volume > 100000 and close <= Highest("data" = high[10], "length" = 10) and high < high[10] and high < low[20]
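To sanity-check the conditions outside of TOS, the final script can be mirrored in Python 3 on plain lists of daily bars. This is a sketch based on my reading of the thinkScript offsets (high[10] as the high 10 bars ago, and Highest(high[10], 10) as the highest high from 10 through 19 bars ago); the function names are mine and the result has not been compared against TOS output.

```python
def sma(values, length):
    """Simple moving average of the most recent `length` values."""
    return sum(values[-length:]) / length


def passes_scan(highs, lows, closes, volumes):
    """Evaluate the scan on the latest bar; lists are ordered oldest to newest."""
    if min(len(highs), len(lows), len(closes), len(volumes)) < 21:
        return False  # low[20] needs 21 bars of history
    high_10_ago = highs[-11]
    low_20_ago = lows[-21]
    # Highest(high[10], 10): highs from 10 through 19 bars ago.
    highest_prior = max(highs[-20:-10])
    return (closes[-1] >= sma(closes, 5)
            and closes[-1] >= sma(closes, 10)
            and sma(closes, 5) >= sma(closes, 10)
            and volumes[-1] > 100000
            and closes[-1] <= highest_prior
            and highs[-1] < high_10_ago
            and highs[-1] < low_20_ago)
```

A stock passes when it has fallen well below where it traded 10 and 20 bars ago but its close has climbed back above both short averages, which is the possible reversal the scanner is looking for.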