Some facts about myCaseTracker

One year later, myCaseTracker has tracked more than 3.2 million receipts, which amounts to over 4.2 million status update records. We have up to 30 concurrent cloud-based agents running 24/7 and are able to scan more than 30,000 records per day. Thank you for all your queries, input and support. These numbers are growing rapidly every day.

Build a R-JSON web service using Rook and rjson

There have been lots of discussions on setting up a web server using Rook, but I haven’t found a complete sample for building an R-based JSON web service. While developing MyCaseTracker 2.0, I implemented an R-Rook-rjson setup for an RPC-type web service.

Let’s start with setting up the web server environment under Rook. If you just want a simple local web server (bound only to 127.0.0.1), it is quite easy with the following 4-step setup:

> library(Rook)
> s <- Rhttpd$new()
> s$start(quiet=TRUE)
> s$print()

However, if you want to bind it to an IP address other than loopback, there is more work to do. Here is my example:

#server parameters
myPort <- 8000
myInterface <- "0.0.0.0"
status <- -1
# R 2.15.1 uses .Internal, but the next release of R will use a .Call.
# Either way it starts the web server.
if (as.integer(R.version[["svn rev"]]) > 59600) {
  status <- .Call(tools:::startHTTPD, myInterface, myPort)
} else {
  status <- .Internal(startHTTPD(myInterface, myPort))
}
if (status == 0) {
  unlockBinding("httpdPort", environment(tools:::startDynamicHelp))
  assign("httpdPort", myPort, environment(tools:::startDynamicHelp))
  s <- Rhttpd$new()
  s$listenAddr <- myInterface
  s$listenPort <- myPort
# Change this line to your own application. You can add more than one
# application if you like
  s$add(name = "test", app = Rook.app)
# Now make the console go to sleep. Of course the web server will still be running.
  while (TRUE) Sys.sleep(24 * 60 * 60)
}

The function Rook.app() is called every time an HTTP request is made.

Rook.app <- function(env) {
  request <- Request$new(env);
  response <- Response$new();
  write.initial.HTML(request,response);
  response$finish();
}

Request contains the HTTP request information, so you can read the GET and POST parameters (and more) from R. Response is the output to be returned to the client.
In the sample above, Rook.app calls write.initial.HTML(request,response) to process the request. You can write your JSON handling in there.

write.initial.HTML <- function(request, response) {
  # Parse the JSON text passed in the GET parameter "p"
  json_parser <- newJSONParser()
  json_parser$addData(request$GET()$p)
  rpc <- try(json_parser$getObject(), silent = TRUE)
  # header() takes the header name and its value as separate arguments
  response$header("Content-Type", "application/json")
  if (inherits(rpc, "try-error")) {
    response$write("not a valid json")
  } else {
    # do.rpc() already returns a JSON string, so write it directly
    response$write(do.rpc(rpc))
  }
}

You will need to load the rjson library first. In the sample above, I used a GET request and the variable name is p, so your URL is something like http://webserver/custom/test?p={JSON DATA}. The path "/custom/test" is defined by Rook. rpc is the object parsed from the JSON in the URL request.
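The JSON in the p parameter contains quotes, braces and spaces, so a client will usually need to percent-encode it. Below is a minimal sketch of building such a URL from R, assuming the rjson package is installed; the host name "webserver" is the placeholder from the text, and the method name and parameters are made up for illustration.

```r
library(rjson)

# Placeholder host from the text; "/custom/test" matches the app name "test".
base_url <- "http://webserver/custom/test"

# Hypothetical RPC request: a method name plus its parameters.
rpc_request <- list(
  method = "myfunc",
  params = list("first parameter", "second parameter")
)

# Serialize to JSON, then percent-encode it so quotes, braces and
# spaces survive inside the query string.
payload <- toJSON(rpc_request)
url <- paste0(base_url, "?p=", URLencode(payload, reserved = TRUE))
print(url)
```

Decoding the query value with URLdecode() and passing it to fromJSON() should give back the original list, which is a quick way to check the round trip before pointing a client at the server.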

From here, I am going to use an example RPC to explain how rjson works with Rook. My function do.rpc() is as follows.
# JSON processor: do.rpc
do.rpc <- function(rpc) {
  rpc$params <- as.list(rpc$params)
  result <- try(do.call(rpc$method, rpc$params), silent = TRUE)
  if (inherits(result, "try-error")) {
    # TODO: JSON-RPC defines several errors (call not found, invalid params, and server error).
    # If a call exists but fails, I am sending "Procedure not found", when really it was found
    # but had an internal error. The data field contains the actual error from R.
    rpc_result <- list(
      jsonrpc = "2.0",
      error = list(code = -32601, message = "Procedure not found.", data = as.character(result)),
      status = 500
    )
  } else {
    # RPC call succeeded
    rpc_result <- list(
      jsonrpc = "2.0",
      result = result,
      status = "success"
    )
  }
  # Return the JSON string, terminated by a newline
  ret <- toJSON(rpc_result)
  ret <- paste(ret, "\n", sep = "")
  return(ret)
}
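The TODO in do.rpc() notes that a failing call and a missing procedure get the same error code. One possible way to tell them apart is sketched below. This is not part of the original service: classify.rpc.error and the allowed whitelist are hypothetical names, and the codes are the ones JSON-RPC 2.0 reserves (-32601 for "Method not found", -32603 for "Internal error").

```r
# Hypothetical helper: pick a JSON-RPC 2.0 error code after a failed call.
# `allowed` is an assumed whitelist of function names clients may invoke.
classify.rpc.error <- function(method_name, allowed) {
  known <- is.character(method_name) && length(method_name) == 1 &&
    method_name %in% allowed && exists(method_name, mode = "function")
  if (!known) {
    # The method is missing or not exposed to clients.
    list(code = -32601, message = "Method not found")
  } else {
    # The method exists, so the failure happened inside it.
    list(code = -32603, message = "Internal error")
  }
}

# "sqrt" exists and is whitelisted, so a failure there counts as internal.
print(classify.rpc.error("sqrt", allowed = c("sqrt"))$code)
# An unknown name is reported as method not found.
print(classify.rpc.error("no.such.function", allowed = c("sqrt"))$code)
```

A whitelist also limits which R functions a remote caller can reach through do.call(), which may be worth considering for a server bound to a public interface.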

In this RPC, rpc$method and rpc$params are the two fields to include in the JSON request. You can send them using
http://webserver/custom/test?p={"method":"myfunc","params":"this is a test"}
or, if you have more than one parameter to pass, you can use
http://webserver/custom/test?p={"method":"myfunc","params":["first parameter","second parameter","third parameter"]}

When do.rpc() runs, it calls the function named by rpc$method ("myfunc" in the sample URL above). This way you can write as many functions as you like and use the same do.rpc() to call them.
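The dispatch step can be tried without a running web server. The sketch below assumes the rjson package is installed; myfunc is a hypothetical target function, and the JSON string stands in for what would arrive in the p parameter.

```r
library(rjson)

# Hypothetical target function; any function visible to do.call()
# can be named in the "method" field.
myfunc <- function(a, b = "") paste("received:", a, b)

# Parse a request the same way the server would after reading p.
rpc <- fromJSON('{"method":"myfunc","params":["hello","world"]}')

# as.list() turns the parsed parameters into positional arguments,
# and do.call() looks the function up by its name.
result <- do.call(rpc$method, as.list(rpc$params))

# Wrap the result in a response envelope and serialize it.
print(toJSON(list(jsonrpc = "2.0", result = result)))
```

Because the parameters are passed positionally, their order in the JSON array has to match the argument order of the R function.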

Hope this helps you set up your own R-Rook-JSON RPC server.

MyCaseTracker v2.0 in progress

Since the launch of MyCaseTracker, I have been inspired by the daily visit volume. V2.0 is in progress with a whole bunch of new items, analytic tools and improved engine performance. Below are the highlights:

1. The scrubbing engine will be improved again to increase performance. This includes robust detection of not-yet-opened receipt regions and auto-detection of approval and denial status.

2. The analytic tools are completely renovated. New statistics and graphs will be produced, including survival analysis and center performance comparison.

3. The most exciting add-on is what is called “claim your receipt”. With this new tool, you are able to claim your receipt and your family members’ receipts, and link your multiple receipts (for example, 140-765-131-485). This information will be used to create more analytics.

Stay tuned..  @myCaseTracker

Microsoft Translator Ruby Gem

Stolen from http://www.techhui.com/profiles/blogs/microsoft-translator-ruby-gem

I know “Microsoft” is not the first word that comes to mind when you’re writing a Ruby application, but since Google dropped the free tier for their translation service, the Microsoft Translator API is a good alternative for a small or personal project where you don’t want to bother with a monthly bill.

 

Recently I’ve had to use this API in a project, and this weekend I extracted the functionality out into a simple gem. I present to you ‘microsoft_translator’ (cue applause) https://github.com/ikayzo/microsoft_translator

 

Before translating things from your ruby application you first need to sign up for the Microsoft Translator API in the Windows Azure Datamarket.  https://datamarket.azure.com/dataset/1899a118-d202-492c-aa16-ba21c3…

 

Don’t worry, they have a free tier! (up to 2 million translated characters/month) Once you sign up for the Translator API you will also need to register your application with the Azure Datamarket.  https://datamarket.azure.com/developer/applications/

 

Also, you shouldn’t stress about what to put for the Redirect URI. For the purposes of this gem you won’t be using it so your project’s homepage will work just fine. You’ll use the Client ID and Client secret to authenticate your requests to the API. Once this is done you’ll install it like you would any other gem…

 

First create a MicrosoftTranslator::Client with your Client ID and secret. To translate, pass in the foreign text along with the language codes for the languages you are translating from and to, and the content type. The content type is either “text/plain” or “text/html”.

translator = MicrosoftTranslator::Client.new('your_client_id', 'your_client_secret')
spanish = "hasta luego muchacha"
translator.translate(spanish,"es","en","text/html")
# => "until then girl"

That’s about it! Here is the list of languages supported by the Microsoft Translator API: http://www.microsofttranslator.com/help/?FORM=R5FD and here are all the language codes as a helpful reference: http://www.loc.gov/standards/iso639-2/php/code_list.php

R server switched to AWS

The dedicated server expires today, so I am switching the R server to Amazon EC2. I want to try running the server as a Spot Request at a cheaper price.

It is built on a t1.micro instance with 64-bit Ubuntu Linux 12. R 2.15.1 is installed. Hopefully the switch won’t cause much downtime.

Tech Overview

This overview documents the architecture of the project..

There are 3 independent modules in this project: scrubbing engine, web interface, and statistical tool:

Scrubbing engine:

This is the hardest part of the project. Ruby/Mechanize is so far the best tool I have found for scrubbing a webpage, and Ruby/MySQL can talk to a background data storage server. At the beginning, only one server with 3 processes was running, and it could scan ~20,000 receipts in a single day. After bringing the site online, the number of servers increased to 4 with 18 concurrent processes, and today more than 100,000 receipts can be scanned in a single day.

Statistical Tool:

There is good and bad about R. R is a free and easy-to-configure statistical tool that runs in both Windows and Unix environments. It is also my best choice because it generates pretty graphs using “ggplot2”. In the current configuration, the web server talks to a completely independent server running only R, sending requests and retrieving graphs. The R server also communicates with the database server to retrieve the data. This setup is somewhat inefficient (there are also geographic concerns), but it ensures R plots FAST!

Web interface:

Web design is my weakness. PHP/MySQL is the only scripting combination I am able to program in. I also need someone with good HTML/CSS talent, maybe HTML5?

A little bit of history

Initially, I wrote a tiny web scrubbing engine to scan the receipt numbers for my own benefit. I totally understand the anxiety of people like me and their families during this long and painful waiting time. Thanks to Mitbbs.com, Trackitt.com and other online forums, I received lots of useful information and help through the process. In return, I decided to convert the engine into a project to help people just like me gather a little more information about their cases, timing, and progress. As a statistician myself, it is also a good way to put my knowledge into practice by analyzing and displaying the ‘big data’ statistically.

Finally, whether you get a little or a lot of useful information from it, I wish you “good luck” in your applications.

-bigfacepig