Wednesday, July 31, 2019

Model-based and Model-free in Reinforcement Learning (RL)

I was going through some predictive analytics reading and came across the interesting terms "model-based" and "model-free". I found the excerpt below in a PDF and thought of blogging it here for my own future reference, as it explains the distinction so well and is easy to understand.

Reinforcement learning methods can broadly be divided into two classes, model-based and model-free.

Consider the problem illustrated in the figure below, of deciding which route to take on the way home from work on Friday evening.

We can abstract this task as having states (in this case, locations, notably of junctions), actions (e.g. going straight on or turning left or right at every intersection), probabilities of transitioning from one state to another when a certain action is taken (these transitions are not necessarily deterministic, e.g. due to road works and bypasses), and positive or negative outcomes (i.e. rewards or costs) at each transition from scenery, traffic jams, fuel consumed, etc. (which are again probabilistic).

Model-based computation, illustrated in the left ‘thought bubble’, is akin to searching a mental map (a forward model of the task) that has been learned based on previous experience. This forward model comprises knowledge of the characteristics of the task, notably, the probabilities of different transitions and different immediate outcomes. Model-based action selection proceeds by searching the mental map to work out the long-run value of each action at the current state in terms of the expected reward of the whole route home, and chooses the action that has the highest value.
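
To make the idea concrete, here is a minimal Python sketch of model-based action selection over a toy commute model. The states, actions, probabilities and rewards below are assumptions I made up for illustration; they are not from the excerpt.

# A toy forward model: P[state][action] = list of (probability, next_state, reward).
# All names and numbers here are made up purely for illustration.
P = {
    "work": {"freeway":  [(0.7, "home", -10), (0.3, "jam", -30)],
             "backroad": [(1.0, "home", -15)]},
    "jam":  {"wait": [(1.0, "home", -20)]},
    "home": {},
}

def value(state, depth=10):
    # Long-run value of a state: search the model, assuming the best action is taken from here on.
    if not P[state] or depth == 0:
        return 0.0
    return max(sum(p * (r + value(s2, depth - 1)) for p, s2, r in outcomes)
               for outcomes in P[state].values())

def best_action(state):
    # Model-based choice: evaluate each action by searching the mental map at decision time.
    return max(P[state], key=lambda a: sum(p * (r + value(s2)) for p, s2, r in P[state][a]))

print(best_action("work"))  # -> 'backroad' for this toy model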

Model-free action selection, by contrast, is based on learning these long-run values of actions (or a preference order between actions) without either building or searching through a model. RL provides a number of methods for doing this, in which learning is based on momentary inconsistencies between successive estimates of these values along sample trajectories. These values, sometimes called cached values because of the way they store experience, encompass all future probabilistic transitions and rewards in a single scalar number that denotes the overall future worth of an action (or its attractiveness compared with other actions). For instance, as illustrated in the right ‘thought bubble’, experience may have taught the commuter that on Friday evenings the best action at this intersection is to continue straight and avoid the freeway.
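
And for comparison, a minimal sketch of the model-free idea: a table of cached action values that gets nudged by the temporal-difference error between successive estimates along sampled trips. Again, the names and numbers are assumptions for illustration only; the environment interaction is left out.

from collections import defaultdict

Q = defaultdict(float)        # cached value of each (state, action) pair
alpha, gamma = 0.1, 0.95      # learning rate and discount factor (assumed values)

def td_update(state, action, reward, next_state, next_actions):
    # Nudge the cached value toward the immediate reward plus the best cached
    # value of the next state; the "momentary inconsistency" is the TD error.
    best_next = max((Q[(next_state, a)] for a in next_actions), default=0.0)
    td_error = reward + gamma * best_next - Q[(state, action)]
    Q[(state, action)] += alpha * td_error

# One sampled transition on a hypothetical Friday commute:
td_update("junction", "straight", -5, "home", [])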

Model-free methods are clearly easier to use in terms of online decision-making; however, much trial-and-error experience is required to make the values good estimates of future consequences. Moreover, the cached values are inherently inflexible: although hearing about an unexpected traffic jam on the radio can immediately affect action selection that is based on a forward model, the effect of the traffic jam on a cached propensity such as ‘avoid the freeway on Friday evening’ cannot be calculated without further trial-and-error learning on days in which this traffic jam occurs. Changes in the goal of behavior, as when moving to a new house, also expose the differences between the methods: whereas model-based decision-making can be immediately sensitive to such a goal shift, cached values are again slow to change appropriately. Indeed, many of us have experienced this directly in daily life after moving house. We clearly know the location of our new home, and can make our way to it by concentrating on the new route; but we can occasionally take a habitual wrong turn toward the old address if our minds wander. Such introspection, and a wealth of rigorous behavioral studies (see [15] for a review), suggest that the brain employs both model-free and model-based decision-making strategies in parallel, with each dominating in different circumstances [14]. Indeed, somewhat different neural substrates underlie each one [17].


Thursday, June 27, 2019

Connecting DBeaver with Spark Databricks Cluster


Got the Simba JDBC driver from Databricks.

Extracted the downloaded zip, and then the SimbaSparkJDBC41-2.6.3.1003.zip inside it.

Adding the Simba driver to DBeaver:
  1. In DBeaver, open Driver Manager
  2. Select New
     i. Give some name to Driver Name (label only)
     ii. Click on Add File and select the SimbaSparkJDBC41-2.6.3.1003.jar file
     iii. Add com.simba.spark.jdbc41.Driver to the Class Name (class name is as of 06/27/2019)
Getting JDBC URL from Databricks:
  1. Go to your cluster in Databricks
  2. Click on Advanced Options in the Configuration tab
  3. Click on JDBC/ODBC tab
  4. Grab the JDBC URL provided which will look like below:
jdbc:spark://<server-name-info>:<port>/default;transportMode=http;ssl=1;httpPath=sql/protocolv1/o/0/cluster-name;AuthMech=3;UID=token;PWD=<personal-access-token>

Getting the Token from Databricks:
  1. Click on the user icon in the top right corner and click on User Settings.
  2. Select Access Tokens and create a token. NOTE: Pay attention to the dialog box, as the token is only shown once, so save it right away.

Connecting to Spark through DBeaver:
  1. Click New Database Connection in DBeaver
  2. Select the driver you just added (check for the label you provided for Driver Name when you added the Simba driver)
  3. Copy the URL you obtained from Databricks into JDBC URL:
  4. Provide the token you obtained in PWD= in the URL:
jdbc:spark://<server-name-info>:<port>/default;transportMode=http;ssl=1;httpPath=sql/protocolv1/o/0/cluster-name;AuthMech=3;UID=token;PWD=##############
  5. Test Connection and you should be connected to your cluster and able to see all databases.

I added UseNativeQuery=1 at the end of the URL as I was getting errors in DBeaver:

jdbc:spark://<server-name-info>:<port>/default;transportMode=http;ssl=1;httpPath=sql/protocolv1/o/0/cluster-name;AuthMech=3;UID=token;PWD=##############;UseNativeQuery=1
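
For reference, the same driver class and URL can also be used from code outside DBeaver. Here is a minimal Python sketch using the jaydebeapi package (a separate install that needs a JVM); the jar path and the URL placeholders are assumptions, not values taken from Databricks.

import jaydebeapi

# Connect with the Simba driver class, the JDBC URL from the cluster page,
# and the jar extracted from the Simba zip (path below is hypothetical).
conn = jaydebeapi.connect(
    "com.simba.spark.jdbc41.Driver",
    "jdbc:spark://<server-name-info>:<port>/default;transportMode=http;ssl=1;"
    "httpPath=sql/protocolv1/o/0/cluster-name;AuthMech=3;UID=token;"
    "PWD=<personal-access-token>;UseNativeQuery=1",
    jars="/path/to/SimbaSparkJDBC41-2.6.3.1003.jar",
)
cursor = conn.cursor()
cursor.execute("show databases")
print(cursor.fetchall())
cursor.close()
conn.close()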

Thursday, May 30, 2019

Concatenated string to individual rows in Spark SQL, PG and Snowflake

I had this column named age_band which will have values like "55-64|65-74|75+".
As you can see, it contains age groups stored as a single string concatenated with '|', and each age group needs to be compared separately.

In PostgreSQL, I took care of this using:


select unnest(string_to_array('55-64|65-74|75+', '|'));


which returns:

# unnest
1 55-64
2 65-74
3 75+

Now, I have to perform the same in Spark SQL and here it is:


select explode(split('55-64|65-74|75+', '[|]'));


and here is the result:

col
55-64
65-74
75+
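
If you prefer the DataFrame API over Spark SQL, the same split-and-explode can be done in PySpark. This is just a minimal sketch; the one-row DataFrame is made up for illustration.

from pyspark.sql import SparkSession
from pyspark.sql.functions import explode, split

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([("55-64|65-74|75+",)], ["age_band"])

# split takes a regex, so the '|' is escaped with a character class, as in the SQL version
df.select(explode(split("age_band", "[|]")).alias("age_band")).show()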

Update: 11/27/2019

And in Snowflake:

select col1, c.value::string as age_band
  from san_test,
     lateral flatten(input=>split(col2, '|')) c;

col2 is the column where the age_band values are concatenated with the '|' delimiter.

select c.value::string as age_band from lateral flatten(input=>split('55-64|65-74|75+', '|')) c;

AGE_BAND
55-64
65-74
75+

Monday, March 18, 2019

AWS Storage Systems Comparison - S3 vs EBS vs EFS

The table below shows the major differences (not all) between the available AWS storage systems.
Note that I am not including Glacier here, as Glacier is purely meant for archiving and it doesn't make sense to compare it with the other storage systems - at least as of today :)


Feature/Product | S3 (Simple Storage Service) | EBS (Elastic Block Store) | EFS (Elastic File System)
Accessibility | Publicly accessible with key-id | Only through the attached EC2 instance | Can be accessed by more than one EC2 instance and by other AWS services
Access Control | IAM, bucket policies and user policies | IAM, security groups | IAM, security groups
Interface | Web interface | File system interface | Web and file system interface
Storage Type | Object storage | Block storage | File storage
Cost | Cheaper than EBS and EFS | More than S3, less than EFS | More expensive than EBS
Scalability | Scalable | Manual | Scalable
Performance | Slower than EBS and EFS | Faster than S3 and EFS | Faster than S3, slower than EBS
Best used for | Storing backups | Meant to be an EC2 drive | Sharable apps and workloads