
Merge pull request #952 from hamiltont/travis-pr

Update travis-ci to work with github Pull Requests
Hamilton Turner 11 years ago
commit
5563b603d1
5 changed files with 214 additions and 30 deletions
  1. .travis.yml (+14 -13)
  2. toolset/README.md (+150 -0)
  3. toolset/run-ci.py (+47 -15)
  4. toolset/setup/linux/bash_functions.sh (+2 -1)
  5. toolset/setup/linux/languages/mono.sh (+1 -1)

+ 14 - 13
.travis.yml

@@ -22,7 +22,6 @@ env:
     # Put this first as Travis-CI builds top to bottom
     - TESTDIR=jobcleaner
 
-
     # Group tests by directory to logically break up travis-CI build. Otherwise
     # we end up starting ~200+ different workers. Seems that ~100 is the limit
     # before their website starts to lag heavily
@@ -33,7 +32,6 @@ env:
     # done    
     - TESTDIR=activeweb
     - TESTDIR=aspnet
-    - TESTDIR=aspnet-stripped
     - TESTDIR=beego
     - TESTDIR=bottle
     - TESTDIR=cake
@@ -157,7 +155,9 @@ before_install:
   - sudo apt-get install openssh-server
   
   # Needed to cancel build jobs from run-ci.py
-  - time gem install travis -v 1.6.16 --no-rdoc --no-ri 
+  # Only install travis command line if this is not a pull request
+  # as it takes a long time and is not used for pull requests
+  - '[ "${TRAVIS_PULL_REQUEST}" = "false" ] && time gem install travis -v 1.6.16 --no-rdoc --no-ri || true'
  
   # Run as travis user (who has passwordless sudo)
   - ssh-keygen -f /home/travis/.ssh/id_rsa -N '' -t rsa  
@@ -172,20 +172,21 @@ before_install:
   # Doesn't work yet
   # - alias run_tfb="coverage run --parallel-mode --omit installs,results"
 install:
-  - time pip install coveralls
-  
   # Install server prerequisites
   - time ./toolset/run-ci.py prereq $TESTDIR
 
-  # Add commit diff to the logs
-  - echo $TRAVIS_COMMIT_RANGE
-  - git diff --name-only $TRAVIS_COMMIT_RANGE
 script: 
-  
   # Run test verification 
   - time ./toolset/run-ci.py test $TESTDIR
 
-after_success:
-  - coverage combine
-  - coverage report
-  - coveralls
+notifications:
+  irc:
+    channels:
+      - "chat.freenode.net#techempower-fwbm"
+    template:
+      - "%{repository_slug} branch '%{branch}' %{result} (%{author})"
+      - "Build %{build_number} took %{duration}. See %{build_url}"
+    on_success: always
+    on_failure: always
+    skip_join: true
+

+ 150 - 0
toolset/README.md

@@ -0,0 +1,150 @@
+# TFB Toolset 
+
+This directory contains the code that TFB uses to automate installing, 
+launching, load testing, and terminating each framework. 
+
+## Travis Integration
+
+This section details how 
+[TFB](https://github.com/TechEmpower/FrameworkBenchmarks) 
+integrates with 
+[Travis Continuous Integration](https://travis-ci.org/TechEmpower/FrameworkBenchmarks). At a 
+high level, there is a GitHub hook that notifies travis-ci every 
+time new commits are pushed to master, or every time new commits 
+are pushed to a pull request. Each push causes travis to launch a 
+virtual machine, check out the code, run an installation, and run
+a verification.
+
+[Travis-ci.org](https://travis-ci.org/) is a free 
+([pro available](https://travis-ci.com/)) service, and we have a limited 
+number of virtual machines available. If you are pushing one 
+commit, consider including `[ci skip]` *anywhere* in the commit 
+message if you don't need Travis. If you are pushing many commits, 
+use `[ci skip]` in *all* of the commit messages to disable Travis. 
+
+### Travis Terminology
+
+Each push to github triggers a new travis *build*. Each *build* 
+contains a number of independent *jobs*. Each *job* is run on an
+isolated virtual machine called a *worker*. 
+
+Our project has one *job* for each framework directory, e.g. 
+one *job* for `go`, one *job* for `activeweb`, etc. Each 
+*job* gets its own *worker* virtual machine that runs the 
+installation for that framework (using `--install server`) and 
+verifies the framework's output using `--mode verify`. 
+
+The *.travis.yml* file specifies the *build matrix*, which is 
+the set of *jobs* that should be run for each *build*. Our 
+*build matrix* lists each framework directory, which causes 
+each *build* to have one *job* for each listed directory. 
+
+### Travis Limits
+
+[Travis-ci.org](https://travis-ci.org/) is a free 
+([pro available](https://travis-ci.com/)) service, and therefore imposes 
+multiple limits. 
+
+Each time someone pushes new commits to master (or to a pull request), 
+a new *build* is triggered that contains ~100 *jobs*, one for each 
+framework directory. This obviously is resource intensive, so it is 
+critical to understand travis limits. 
+
+**Minutes Per Job**: `50 minutes` maximum. None of the *jobs* we run hit 
+this limit (most take 10-15 minutes total).
+
+**Max Concurrent Jobs**: `4 jobs`, but can increase to 10 if Travis has low 
+usage. This is our main limiting factor, as each *build* causes ~100 *jobs*.
+Discussed below
+
+**Min Console Output**: If `10 minutes` pass with no output to stdout or stderr, 
+Travis considers the *job* as errored and halts it. This affects some of our
+larger tests that perform part of their installation inside of their `setup.py`. 
+Discussed below
+
+**Max Console Output**: A *job* can only output `4MB` of log data before it 
+is terminated by Travis. Some of our larger builds (e.g. `aspnet`) run into 
+this limit, but most do not
+
+### Dealing with Travis' Limits
+
+**Max Concurrent Jobs**: Basically, we cancel any unneeded jobs. Practically,
+canceling is entirely handled by `run-ci.py`. If needed, the TechEmpower team
+can manually cancel *jobs* (or *builds*) directly from the Travis website. 
+Every *build* queues every *job*; there is no way to avoid queueing *jobs*
+we don't need, so the only solution is to cancel the unneeded jobs. 
+
+**Min Console Output**: Some frameworks run part of their installation 
+inside of their `setup.py`'s start method, meaning that all output goes into 
+the `out.txt` file for that test. The TFB toolset needs to be updated to 
+occasionally trigger some output, although this is a non-trivial change for a 
+few reasons. If your framework is erroring in this way, consider attempting to 
+run your installation from the `install.sh` file, which avoids this issue. 
+
+### Advanced Travis Details
+
+#### The Run-Continuous Integration (i.e. run-ci.py) Script
+
+`run-ci.py` is the main script for each *job*. While `run-ci.py` calls 
+`run-tests.py` to do any actual work, it first checks if there is any 
+reason to run a verification for this framework. This check uses `git diff`
+to list the files that have been modified. If files relevant to this 
+framework have not been modified, then `run-ci.py` doesn't bother running 
+the installation (or the verification) as nothing will have changed from 
+the last build. We call this a **partial verification**, and if only one 
+framework has been modified then the entire build will complete within 
+10-15 minutes. 
+
+*However, if anything in the `toolset/` directory has been modified, then
+every framework is affected and no jobs will be cancelled!* We call this 
+a **full verification**, and the entire build will complete within 4-5 hours. 
+
+In order to cancel Travis *jobs*, `run-ci.py` uses the [Travis Command Line
+Interface](https://github.com/travis-ci/travis.rb). Only TechEmpower admins
+have permission to cancel *jobs* on 
+[TechEmpower's Travis Account](https://travis-ci.org/TechEmpower/FrameworkBenchmarks/builds/31771076), 
+so `run-ci.py` uses an authentication token to log in to GitHub (and therefore
+Travis) as a TechEmpower employee, and then cancels *jobs* as needed. 
+
+#### The 'jobcleaner' 
+
+Because we have so many *jobs*, launching *workers* just to have them be 
+cancelled can take quite a while. `jobcleaner` is a special job listed first
+in the *build matrix*. `run-ci.py` notices the `jobcleaner` keyword and 
+attempts to cancel any unnecessary *jobs* before the *workers* are even 
+launched. In practice this is quite effective - without `jobcleaner` a 
+partial verification takes >1 hour, but with `jobcleaner` a partial 
+verification can take as little as 10 minutes.  
+
+This takes advantage of the fact that Travis currently runs the 
+*build matrix* roughly top to bottom, so `jobcleaner` is triggered early 
+in the build. 
+
+#### Pull Requests vs Commits To Master
+
+When verifying code from a pull request, `run-ci.py` cannot cancel any 
+builds due to a Travis [security restriction](http://docs.travis-ci.com/user/pull-requests/#Security-Restrictions-when-testing-Pull-Requests) 
+([more details](https://github.com/TechEmpower/FrameworkBenchmarks/issues/951)). 
+
+Therefore, `run-ci.py` returns `pass` for any *job* that it would normally 
+cancel. *Jobs* that would not be canceled are run as normal. The final 
+status for the verification of the pull request will therefore depend on the 
+exit status of the *jobs* that are run as normal - if they return `pass` then
+the entire build will `pass`, and similarly for fail. 
+
+For example, if files inside `aspnet/` are modified as part of a pull request, 
+then every *job* but `aspnet` is guaranteed to return `pass`. The return code 
+of the `aspnet` job will then determine the final exit status of the build. 
+
+#### Running Travis in a Fork
+
+A Travis account specific to your fork of TFB is highly valuable, as you have 
+personal limits on *workers* and can therefore see results from Travis much 
+more quickly than you could when the Travis account for TechEmpower has a 
+full queue awaiting verification. 
+
+You will need to modify the `.travis.yml` file to contain your own (encrypted)
+`GH_TOKEN` environment variable. Unfortunately there is no way to externalize 
+encrypted variables, and therefore you will have to manually ensure that you 
+don't include your changes to `.travis.yml` in any pull request or commit to 
+master!
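
Note on the README above: the rules it describes for partial verification, full verification, and pull requests can be summarised in a few lines of Python. This is only a sketch under stated assumptions, not the logic in `run-ci.py` (whose `_should_run` body is not part of this diff). The helper name `job_action`, its parameters, and the three return strings are hypothetical and chosen for illustration.

```python
def job_action(changed_files, test_directory, is_pull_request):
    """Decide what one job should do (hypothetical helper, not from run-ci.py).

    changed_files   -- paths as printed by `git diff --name-only`
    test_directory  -- the framework directory this job covers, e.g. "aspnet"
    is_pull_request -- True when the build was triggered by a pull request
    """
    # Any change under toolset/ affects every framework: full verification.
    toolset_changed = any(p.startswith("toolset/") for p in changed_files)

    # Otherwise only the framework whose own directory was touched must run.
    framework_changed = any(
        p.startswith(test_directory + "/") for p in changed_files
    )

    if toolset_changed or framework_changed:
        return "run"

    # Pull requests cannot cancel jobs, so unneeded jobs report success.
    return "pass" if is_pull_request else "cancel"


if __name__ == "__main__":
    files = ["aspnet/setup.py"]
    for directory in ("aspnet", "go"):
        print(directory, job_action(files, directory, is_pull_request=True))
```

Matching on `test_directory + "/"` rather than the bare name keeps a change in `aspnet-stripped/` from triggering the `aspnet` job.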

+ 47 - 15
toolset/run-ci.py

@@ -34,24 +34,43 @@ class CIRunnner:
     '''
 
     logging.basicConfig(level=logging.INFO)
-
-    try:
-      self.commit_range = os.environ['TRAVIS_COMMIT_RANGE']
-    except KeyError:
-      log.warning("Run-ci.py should only be used for automated integration tests")
-      last_commit = subprocess.check_output("git rev-parse HEAD^", shell=True).rstrip('\n')
-      self.commit_range = "master...%s" % last_commit
     
     if not test_directory == 'jobcleaner':
       tests = self.gather_tests()
       
-      # Only run the first test in this directory
-      self.test = [t for t in tests if t.directory == test_directory][0]
+      # Run the first linux-only test in this directory
+      dirtests = [t for t in tests if t.directory == test_directory]
+      validtests = [t for t in dirtests if t.os.lower() == "linux"
+                    and t.database_os.lower() == "linux"]
+      log.info("Found %s tests (%s valid) in directory %s", 
+        len(dirtests), len(validtests), test_directory)
+      if len(validtests) == 0:
+        log.critical("Found No Valid Tests, Aborting!")
+        sys.exit(1)
+      self.test = validtests[0]
       self.name = self.test.name
+      log.info("Choosing to run test %s in %s", self.name, test_directory)
 
     self.mode = mode
     self.travis = Travis()
 
+    try:
+      # See http://git.io/hs_qRQ
+      #   TRAVIS_COMMIT_RANGE is empty for pull requests
+      if self.travis.is_pull_req:
+        self.commit_range = "%s..FETCH_HEAD" % os.environ['TRAVIS_BRANCH'].rstrip('\n')
+      else:  
+        self.commit_range = os.environ['TRAVIS_COMMIT_RANGE']
+    except KeyError:
+      log.warning("Run-ci.py should only be used for automated integration tests")
+      last_commit = subprocess.check_output("git rev-parse HEAD^", shell=True).rstrip('\n')
+      self.commit_range = "master...%s" % last_commit
+
+    log.info("Using commit range %s", self.commit_range)
+    log.info("Running `git diff --name-only %s`" % self.commit_range)
+    changes = subprocess.check_output("git diff --name-only %s" % self.commit_range, shell=True)
+    log.info(changes)
+
   def _should_run(self):
     ''' 
     Decides if the current framework test should be tested or if we can cancel it.
@@ -89,11 +108,8 @@ class CIRunnner:
       return 0
 
     log.info("Running %s for %s", self.mode, self.name)
-
-    # Use coverage so we can send code coverate to coveralls.io
-    command = "coverage run --source toolset,%s --parallel-mode " % self.test.directory
     
-    command = command + 'toolset/run-tests.py '
+    command = 'toolset/run-tests.py '
     if mode == 'prereq':
       command = command + "--install server --test ''"
     elif mode == 'install':
@@ -193,16 +209,28 @@ class CIRunnner:
 class Travis():
   '''Integrates the travis-ci build environment and the travis command line'''
   def __init__(self):     
-    self.token = os.environ['GH_TOKEN']
     self.jobid = os.environ['TRAVIS_JOB_NUMBER']
     self.buildid = os.environ['TRAVIS_BUILD_NUMBER']
-    self._login()
+    self.is_pull_req = "false" != os.environ['TRAVIS_PULL_REQUEST']
+
+    # If this is a PR, we cannot access the secure variable 
+    # GH_TOKEN, and instead must return success for all jobs
+    if not self.is_pull_req:
+      self.token = os.environ['GH_TOKEN']
+      self._login()
+    else:
+      log.info("Pull Request Detected. Non-necessary jobs will return pass instead of being canceled")
 
   def _login(self):
     subprocess.check_call("travis login --skip-version-check --no-interactive --github-token %s" % self.token, shell=True)
     log.info("Logged into travis") # NEVER PRINT OUTPUT, GH_TOKEN MIGHT BE REVEALED
 
   def cancel(self, job):
+    # If this is a pull request, we cannot interact with the CLI
+    if self.is_pull_req:
+      log.info("Thread %s: Return pass for job %s", threading.current_thread().name, job)
+      return
+
     # Ignore errors in case job is already cancelled
     try:
       subprocess.check_call("travis cancel %s --skip-version-check --no-interactive" % job, shell=True)
@@ -214,6 +242,10 @@ class Travis():
       subprocess.call("travis cancel %s --skip-version-check --no-interactive" % job, shell=True)
 
   def build_details(self):
+    # If this is a pull request, we cannot interact with the CLI
+    if self.is_pull_req:
+      return "No details available"
+
     build = subprocess.check_output("travis show %s --skip-version-check" % self.buildid, shell=True)
     return build
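
Note on the `run-ci.py` changes above: the two decisions added to `__init__` (picking the first Linux-only test, and choosing a commit range depending on whether the build is a pull request) can be isolated as pure functions, which makes them easy to check without a Travis environment. The sketch below is written for Python 3 and is not the committed code. The names `FrameworkTest`, `first_linux_test`, and `select_commit_range` are hypothetical; returning `None` instead of calling `sys.exit(1)`, and treating a missing `TRAVIS_PULL_REQUEST` as `"false"`, are simplifying assumptions.

```python
import os
from collections import namedtuple

# Hypothetical stand-in for the toolset's test objects; only the fields the
# diff reads (name, directory, os, database_os) are modelled here.
FrameworkTest = namedtuple("FrameworkTest", "name directory os database_os")


def first_linux_test(tests, test_directory):
    """Return the first test in test_directory that is Linux-only, else None."""
    dirtests = [t for t in tests if t.directory == test_directory]
    valid = [
        t for t in dirtests
        if t.os.lower() == "linux" and t.database_os.lower() == "linux"
    ]
    return valid[0] if valid else None


def select_commit_range(env=None):
    """Pick the range to hand to `git diff --name-only`.

    For pull requests the diff above builds the range from TRAVIS_BRANCH
    and FETCH_HEAD; for pushes it uses TRAVIS_COMMIT_RANGE as-is.
    Raises KeyError when the needed variable is missing (the commit itself
    falls back to a range based on `git rev-parse HEAD^` in that case).
    """
    env = os.environ if env is None else env
    # Assumption: an absent TRAVIS_PULL_REQUEST is treated as "not a PR".
    is_pull_req = env.get("TRAVIS_PULL_REQUEST", "false") != "false"
    if is_pull_req:
        return "%s..FETCH_HEAD" % env["TRAVIS_BRANCH"].strip()
    return env["TRAVIS_COMMIT_RANGE"]


if __name__ == "__main__":
    tests = [
        FrameworkTest("aspnet", "aspnet", "Windows", "Linux"),
        FrameworkTest("aspnet-mono", "aspnet", "Linux", "Linux"),
    ]
    print(first_linux_test(tests, "aspnet").name)
    print(select_commit_range(
        {"TRAVIS_PULL_REQUEST": "952", "TRAVIS_BRANCH": "master"}
    ))
```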
 

+ 2 - 1
toolset/setup/linux/bash_functions.sh

@@ -12,7 +12,8 @@ fw_get () {
 }
 
 fw_untar() {
-  tar xvf "$@"
+  echo "Running 'tar xf $@'...please wait"
+  tar xf "$@"
 }
 
 fw_unzip() {

+ 1 - 1
toolset/setup/linux/languages/mono.sh

@@ -8,7 +8,7 @@ RETCODE=$(fw_exists mono-3.2.8)
   return 0; }
 
 fw_get http://download.mono-project.com/sources/mono/mono-3.2.8.tar.bz2 -O mono-3.2.8.tar.bz2
-tar vxf mono-3.2.8.tar.bz2
+fw_untar mono-3.2.8.tar.bz2
 cd mono-3.2.8 
 ./configure --disable-nls --prefix=/usr/local
 make get-monolite-latest