My Ambitions as a Blogger: 2015

Thursday, March 19, 2015

Twitter Sentiment Analysis in R

This has been one of the hardest projects I've taken on.

I've never been asked to do this. It's just for fun, but it was challenging. I'm used to coding in Java, and since I figured using R might help me in the long run, it would be nice to be able to do some things worth mentioning.

With that being said, I got interested in using Twitter's API to analyze tweets, and I ultimately came across this YouTube video on Twitter Mining and Sentiment Analysis.

I liked the video. Michael Herman admitted he wasn't very experienced with R at the time he made his video, but he still managed to execute the code and get sentiment analysis off of the tweets he extracted. This particular method of sentiment analysis seems to be widely used as far as R-Twitter tutorials are concerned. There are a few problems though. One, this video was published in 2012, which is important to note because there may have been some changes in the R versions and Twitter API between now and then. Two, the R code for the sentiment scores function can use a few tweaks, if not a complete re-do; and three, there aren't a lot of channels that have an updated version of this tutorial.

The sentiment scores function (or method) is just an R function someone made to help the user with sentiment analysis, a process aimed at discerning the widespread opinion or sentiment on any given topic, idea, product or person. The scores function takes the group of words you're interested in analyzing and turns it into something quantifiable, identifying and categorizing the opinions (often numerically) in order to determine whether someone's attitude towards a particular topic, product, etc., is positive (greater than zero), negative (less than zero), or neutral (zero).

I've been looking around for good, functional code that would help repeat what Herman did in his video, because I came across some problems. It wasn't easy, but after playing around with the code and doing a lot of searches (because I still consider myself a beginner with R), I successfully analyzed the Twitter data. You should watch the video to see where the changes were made.

Here is my R code for the Sentiment Analysis:

Importing the data

The code provided in the video is outdated and thus will not work because Twitter changed the way a user can access its API. Now you're going to need authentication in order to grab tweets. In order to get that authentication, you're going to need to create a Twitter app (it's pretty easy). The authentication comes in the form of keys and tokens. Luckily, the twitteR package has been updated to accommodate the change.

 library(twitteR); library(plyr)

 setup_twitter_oauth("API key", "API secret", "Access Token", "Access Token Secret")

Like the example in the video, we will search for '#abortion', but instead of searching for 1500 tweets, I will search for 200 because (1) R can take a while just to grab 30 tweets, and (2) I don't feel like waiting too long to do this; I just want to show that the code works.

tweets = searchTwitter("#abortion", n=200)

length(tweets)
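One step that is easy to miss: searchTwitter() returns a list of status objects, not plain text, while the scoring function used later wants a character vector. Below is a minimal sketch of pulling the text out. The helper name extract_text and the fake_tweets stand-in are my own inventions so the snippet can run without a Twitter connection; with real data you would pass in the tweets list, whose objects offer a getText() method in the twitteR package.

```r
# Hypothetical helper: collect the text of each tweet into a character vector.
# Each element is expected to offer a getText() method, as twitteR status
# objects do.
extract_text <- function(tweet_list) {
  vapply(tweet_list, function(tw) tw$getText(), character(1))
}

# Stand-in objects (NOT real tweets) that mimic the getText() method,
# so the sketch can run offline.
fake_tweets <- list(
  list(getText = function() "I love this #rstats tutorial"),
  list(getText = function() "Twitter's API changes are awful")
)

demo_text <- extract_text(fake_tweets)
print(demo_text)

# With real data from searchTwitter():
# tweets.text <- extract_text(tweets)
```

The commented-out last line is how I would expect it to be used on the real search results.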

The Algorithm

The next thing to do, if you haven't already, is to download the documents that Michael mentioned in the video. I did this, and it turned out to be messy. I don't know why, but when I went to the links, the text files I downloaded came out in HTML format, which incredibly complicated the process. Instead, I used this link provided by Bing Liu, which gives you a rar file containing the two files you need. These files are just lists of positive (positive-words.txt) and negative (negative-words.txt) words. We will need them to match the words in the tweets against the words from both files.

To make these easier to call in R, save or place them in the folder you have set as your R working directory. For instance, you can just set your desktop to be the working directory (the place R looks for files by default), and then you won't have to worry about typing each file's exact location, because R already knows where the files are.
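Reading the two word lists into R could look something like the sketch below. It assumes the files keep Bing Liu's layout, where header lines start with a semicolon and each word sits on its own line. The helper name read_lexicon and the small temporary file are made up for illustration, so the snippet runs even before you download anything.

```r
# Hypothetical helper: read one word per line, skipping anything after ';'
# (assumed to be the comment/header marker in the word-list files).
read_lexicon <- function(path) {
  scan(path, what = "character", comment.char = ";", quiet = TRUE)
}

# Tiny stand-in file so the sketch runs without the real download.
demo_file <- tempfile(fileext = ".txt")
writeLines(c("; header comment", "", "good", "great", "happy"), demo_file)

demo_words <- read_lexicon(demo_file)
print(demo_words)

# With the real files in the working directory:
# pos <- read_lexicon("positive-words.txt")
# neg <- read_lexicon("negative-words.txt")
```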

Now, the meaty part of this tutorial is the score.sentiment() function. This was what actually gave me the most problems, because if this isn't right, your analysis (for this tutorial) isn't going anywhere. I've checked Michael's attempt (the code I saw circulating around the internet the most), and I've tried Silvia Planella's updated version of the code; both times I encountered errors, and they were incredibly frustrating. With Michael's code, I got confused because there were no words I wished to exclude, but the function appeared to require them (through its 'exc.words' input); Michael didn't need to provide any, yet R wouldn't let me continue unless I did. Planella's version of the code removed the 'exc.words' part of the function, which eliminates that problem, but the code never accounted for characters that R cannot recognize.

For instance, here's one of the tweet messages I had extracted: 'Awarded â‚¬2,000 & incited change. 2 yrs later #abortion was legalized in Portugal í ½í¹Œ @Vesselthefilm @BostonDoulas #reprojustice'. When I try to process the messages in the score.sentiment() function, the entire process would provide an error that looked like this:

 Error during wrapup: invalid input 'Awarded â‚¬2,000 &amp; incited change. 2 yrs later #abortion was legalized in Portugal í ½í¹Œ @Vesselthefilm @BostonDoulas #reprojustice' in 'utf8towcs'

If you're not used to this kind of thing, then I bet you're gonna look at it like I did...heck, maybe experts have problems with this from time to time.

The problem is that some characters are not valid, so if you come across this problem, you have to find a way to exclude the invalid characters from the analysis, either from within R or before importing the files for processing.
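If you would rather strip the offending characters from within R, one option is base R's iconv(), which can drop anything it cannot convert. This is only a sketch under the assumption that the text is (or claims to be) UTF-8, and it is a blunt tool: accented letters, currency symbols and emoji all get thrown away. The function name strip_non_ascii is made up for illustration.

```r
# Hypothetical helper: drop every byte that cannot be represented in ASCII.
# sub = "" tells iconv() to replace unconvertible bytes with nothing.
strip_non_ascii <- function(x) {
  iconv(x, from = "UTF-8", to = "ASCII", sub = "")
}

# "\u20ac" is the euro sign, similar to what showed up in my tweet.
cleaned <- strip_non_ascii("Awarded \u20ac2,000 & incited change")
print(cleaned)

# tolower() is now safe on the cleaned text.
print(tolower(cleaned))
```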

Thankfully, I found someone with valuable experience for this situation: Gaston Sanchez. From his blog post, I learned that the error I encountered occurs when these unrecognized characters pass through the tolower() function, which is used in the sentiment score function, so I updated the code by adding a try-catch wrapper to account for these potential errors; problem fixed.

 score.sentiment = function(sentences, pos.words, neg.words, .progress='none')
{
  require(plyr)
  require(stringr)
  
  # we got a vector of sentences. plyr will handle a list
  # or a vector as an "l" for us
  # we want a simple array of scores back, so we use
  # "l" + "a" + "ply" = "laply":
  scores = laply(sentences, function(sentence, pos.words, neg.words) {
    
    # clean up sentences with R's regex-driven global substitute, gsub():
    sentence = gsub('[[:punct:]]', '', sentence)
    sentence = gsub('[[:cntrl:]]', '', sentence)
    sentence = gsub('\\d+', '', sentence)
    
    # convert to lower case:
    # Instead of a regular tolower function, make a try-catch function
    tryTolower = function(sentence)
    {
      # create missing value
      # this is where the returned value will be
      y = NA
      # tryCatch error
      try_error = tryCatch(tolower(sentence), error = function(e) e)
      # if not an error
      if (!inherits(try_error, "error"))
        y = tolower(sentence)
      return(y)
    }
    sentence = tryTolower(sentence)
    
    # split into words. str_split is in the stringr package
    word.list = str_split(sentence, '\\s+')
    # sometimes a list() is one level of hierarchy too much
    words = unlist(word.list)
    
    # compare our words to the dictionaries of positive & negative terms
    pos.matches = match(words, pos.words)
    neg.matches = match(words, neg.words)
    
    # match() returns the position of the matched term or NA
    # we just want a TRUE/FALSE:
    pos.matches = !is.na(pos.matches)
    neg.matches = !is.na(neg.matches)
    
    # and conveniently enough, TRUE/FALSE will be treated as 1/0 by sum():
    score = sum(pos.matches) - sum(neg.matches)
    
    return(score)
  }, pos.words, neg.words, .progress=.progress )
  
  scores.df = data.frame(score=scores, text=sentences)
  return(scores.df)
}
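To see what the scoring step does to one tweet at a time, here is a stripped-down sketch in base R only (no plyr or stringr, and no try-catch). It is not a replacement for the function above; score_one and the tiny demo_pos and demo_neg word lists are names and data I made up so the logic can be checked by hand.

```r
# Hypothetical single-sentence scorer mirroring the steps in score.sentiment():
# clean, lower-case, split into words, then count lexicon matches.
score_one <- function(sentence, pos.words, neg.words) {
  sentence <- gsub("[[:punct:]]", "", sentence)
  sentence <- gsub("[[:cntrl:]]", "", sentence)
  sentence <- gsub("\\d+", "", sentence)
  words <- unlist(strsplit(tolower(sentence), "\\s+"))
  # positive matches minus negative matches
  sum(!is.na(match(words, pos.words))) - sum(!is.na(match(words, neg.words)))
}

# Made-up mini lexicons, standing in for the real word lists.
demo_pos <- c("good", "great", "love")
demo_neg <- c("bad", "awful", "hate")

demo_sentences <- c("I love this, it is great!",
                    "Awful service, I hate it.",
                    "Just a tweet about 2 things")

demo_scores <- vapply(demo_sentences, score_one, numeric(1),
                      pos.words = demo_pos, neg.words = demo_neg,
                      USE.NAMES = FALSE)
print(demo_scores)
```

The first sentence hits two positive words, the second hits two negative words, and the third hits none, so the scores come out as 2, -2 and 0.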

The score.sentiment() function returns tabular data with multiple columns and multiple rows. In R, the data.frame is the workhorse for such spreadsheet-like data.

Subsequent Analyses

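After you use the function I provided, you should be able to get similar results and the same functionality as in the video. Here, tweets.text is the character vector of tweet messages, and pos and neg are the word lists read from positive-words.txt and negative-words.txt. Cheers.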

 analysis = score.sentiment(tweets.text, pos, neg, .progress="text")

> table(analysis$score)

-3 -2 -1  0  1  2  3 
 1  5 86 54 31 21  2

> median(analysis$score)
[1] 0

> mean(analysis$score)
[1] -0.1

> hist(analysis$score)
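If you want a quick read on the split between negative, neutral and positive tweets, the counts in the table output are enough to rebuild the score vector and summarize it. This sketch uses only base R; the variable names are mine, and with your own run you would use analysis$score directly instead of rebuilding it.

```r
# Rebuild the 200 scores from the table(analysis$score) output.
score_values <- -3:3
score_counts <- c(1, 5, 86, 54, 31, 21, 2)
scores <- rep(score_values, times = score_counts)

# Label each score by its sign: below zero, zero, above zero.
sentiment <- factor(sign(scores),
                    levels = c(-1, 0, 1),
                    labels = c("negative", "neutral", "positive"))
sentiment_counts <- table(sentiment)

print(mean(scores))
print(median(scores))
print(sentiment_counts)
```

For this sample that works out to 92 negative, 54 neutral and 54 positive tweets, which fits the slightly negative mean of -0.1.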

I think there is more that can be done with the sentiment analysis, but right now this is good enough. I checked out this Villanova University paper and it provided a neat template for a sentiment analysis function. Maybe I will be able to contribute to this someday.


Through an economic lens: the nexus between migration and the human trafficking industry

Human trafficking can easily be confused with migrant smuggling, but they're not the same, and the difference is important. Migrant smuggling is a commercial service that normally occurs with the consent of the migrants; it is a form of illegal migration that does not typically involve exploitation. Human trafficking, by contrast, is a situation in which an individual travelling abroad is locked up and forced to work for little or no pay by means of coercion.

However, the fact that they aren't the same doesn't mean there isn't a close link between the two topics. Yes, factors such as legislation, law enforcement, ethnic discrimination, corruption, and insufficient education are widely considered significant drivers of human trafficking, but leaving migration out of that analysis may prove detrimental to counter-trafficking efforts. This is because one of the ways millions of people get exploited is through migration efforts, especially if they live in less developed countries. Their willingness to depart and accept risks in the migration process makes them prime candidates to be exploited by criminal agents, who benefit from the enormous information asymmetries involved.

Socioeconomic conditions

The economic situation of people in the poorer regions of the world (like Mauritania, certain areas in Thailand, Pakistan, etc.) is a fundamental breeding ground for trafficking and exploitation, which may end up pushing vulnerable people to emigrate and seek better opportunities abroad. We're talking about a supply of potential victims and criminal actors fostered by high unemployment, low wages and poor institutions. These factors have fostered the emergence of shadow industries offering migration services such as border crossings and illegal work abroad (as domestic house maids, prostitutes, etc.). For residents of low- and even middle-income countries, the large potential gains from migration, combined with network effects, have generated an unprecedented push for legal migration to richer countries. However, with most middle- and high-income country labor markets open primarily to domestic workers, there are very limited legal working opportunities available for foreign workers. So while there are fewer opportunities for them in the more desirable sectors, which tend to be reserved for people with better education, the demand for prostitutes and cheap manual workers in both high-income and low-income countries remains constant and high.

People as "commodities"

Slavery is a life course event: when one is enslaved, there is no fixed endpoint for when it might end for him or her. The only guaranteed end to a person's slavery is the end of his or her lifespan, and to criminal actors, that lifespan is tied to the slave's value. As one researcher put it, “people are a good commodity as they do not easily perish, but they can be transported over long distances and can be re-used and re-sold”. This is consistent with the thought that, for more people than we give credit for, a slave can be a victim of not just one, but many kinds of trafficking.

Incentive to Shut up

What tends to trap the victims is an environment that makes it hard for them to escape. Sometimes the government or police in the area are so corrupt that a victim can escape and be sent right back to the place he or she fled from, and these kinds of dilemmas can discourage victims from denouncing their traffickers. Victims residing illegally in other countries face a similar dilemma, since denouncing their traffickers puts them at great risk of getting deported and of facing legal consequences with authorities in their home countries. In economic terms, they tend to find the costs of ratting on the traffickers greater than the benefits.

Labor Supply: mostly voluntary

This is not the case for all, but most victims of trafficking depart for another country through voluntary means. Based on the scarce scholarly input, two of these voluntary examples were modeled. These models are consistent with Bales' writings on ways people can get trafficked these days.

One model describes the interaction between trafficking and migration, where potential victims pay a smuggler to help them cross the border. In this situation, once migrants depart, whether the potential victim ends up being trafficked depends on the smuggler’s decision and the profitability of exploitation, which might say something about the kinds of persons that tend to get trafficked in a given country. A different kind of model looks at illegal migration markets with debt and labor contracts. Most migrants cannot pay for migration costs in advance, so criminal intermediaries and smugglers tend to offer loans to potential victims, which they have to pay back once they work in the destination country. The enforcement of these contracts occurs in the criminal sector, which is reasonable: the contracts themselves aren't legal.

Speculatively, this can be really bad for potential victims, because if the lender provides money for them to travel, chances are the lender will also supply the migrant with a job, likely within the lender's network, where the criminal actor can monitor and control the victim's movements. Based on the cases in Brazil and in Pakistan, there is good reason to think that a slave won't get paid enough money to pay off the debt in a reasonable period of time, assuming he or she gets paid at all.

Note #1: Human trafficking and migration pretty much go hand in hand, depending on how you see it. There is indeed a thin line between the two: that thin line may be a matter of whether the migrant ends up a victim to slavery or not.

A cost-minimizing effort

In the business side of human trafficking, firms hire traffickers to smuggle in people for the sake of lowering labor costs. They never expected to pay the victims a lot of money in the first place, otherwise they might as well pay for more "skilled" workers. A scholarly resource modeled this graph depicting demand for Human Trafficking victims. The limit paid for slaves is represented by the dotted part of the demand curve.

(Graph: Demand for Human Trafficking, Wheaton et al, 2010)

Hypothetically speaking, the range of hourly pay would be somewhere in the range 0 ≤ k < q, where k is the amount of money paid to the slave, and q is the lowest amount an employer would pay a regular worker. This is consistent with the viable price region labeled on the graph, where P-high can be seen as the final price before the employer would prefer to pay for regular, more "skilled" workers, again denoted by q.

A monopolistic-competitive look

Like a monopolistic competitive industry, the market for human trafficking victims is characterized by

many buyers and sellers
differentiated products (the laborers at various age, gender, and ethnic combinations)
easy entry and exit

The profits made in the human trafficking industry are substantial: about $32 billion in illicit profits a year is generated for traffickers. This attracts other firms or small-networked entrepreneurs to enter the market and make profits; otherwise, they leave. Those who enter are able to control at least some of the price, given the diverse group of people they can target from within their communities.

Note #2: The models mentioned provide a glimpse of how things are seen on the business side to human trafficking, and it is heavily fueled by a combination of desired lower labor costs, hopes of a better life, and immense scarcity (in relation to other regions/countries).

Sources

Tamura. 2007. "Migrant Smuggling." IIIS Discussion Paper No. 207, Institute for International Integration Studies.
Mahmoud & Trebesch. 2009. "The Economic Drivers of Human Trafficking: Micro-Evidence from Five Eastern European Countries." Kiel Institute for the World Economy.
Friebel and Guriev. 2012. "Human Smuggling." IZA Discussion Paper No. 6350. The Institute for the Study of Labor.
Bales, Kevin. 2012. "Disposable People: New Slavery in the Global Economy." University of California Press.
Wheaton et al. 2010. "Economics of Human Trafficking." International Migration Vol. 48, IOM.


Robert Kosara on the value of illustrating numbers

Robert Kosara shared some valuable insight on the way numbers are illustrated when using visualizations. To read the full post, click here.

Some important points from the post:

Showing data isn’t always about trying to convey an insight:

It can also be a tool to communicate a fact, an amount, or an issue beyond just the sheer numbers. Although data illustration is poorly understood, it can be very powerful.

All data is not created equal:

You can turn any kind of data into a bar chart and get some sort of insight out of it. However, Kosara thinks some data just requires a bit more care and thought – not because of its structure, but because of what it represents. Kosara puts it nicely:

"When it comes to data about people, perhaps the approach needs to be a bit more thoughtful and respectful. Looking at data about homeless people, do we really need yet another goddamn bar chart? Is there not a more appropriate way to look at this data? Or think of the design process and thought that went into the 9/11 Memorial. This isn’t the phone book, these are all individuals who died in horrendous ways."

Getting the sense of the "number":

While numbers always invite comparison, there is a point at which comparison becomes distracting and an excuse to minimize the significance of the number. Kosara argues that we spend too much time comparing numbers instead of appreciating them. He uses the examples of gun and drone strike deaths, claiming that when we try to derive the significance of, let's say, 2,300 deaths via guns by comparing them to some other cause of death, we have already missed the point; and I agree with him. The goal of these numbers is to punch you in the gut and make you feel some kind of way.

I hope to take these lessons to heart on my analytics journey. I wouldn't want to be so obsessed with digits that I become desensitized to the weight of what a number reflects. This should not only keep me aware, but guide my data visualization strategies as well.

Understand the different aspects of Human Trafficking

From Human Trafficking.org:

Sex Trafficking: Victims of sex trafficking are often found working in establishments that offer commercial sex acts, i.e. brothels, strip clubs, pornography production houses. Such establishments may operate under the guise of:

Massage parlors

Escort services

Adult bookstores

Modeling studios

Bars/strip clubs

Not every person working in these establishments will have technically been trafficked. It would be necessary for trained authorities or service providers to interview each person individually to determine trafficking.

Labor Trafficking: People forced into indentured servitude can be found in:

Sweatshops (where abusive labor standards are present)

Commercial agricultural situations (fields, processing plants, canneries)

Domestic situations (maids, nannies)

Construction sites (particularly if public access is denied)

Restaurant and custodial work

Another kind of labor trafficking involves child soldiers. In over 40 countries across the globe, thousands of children are being forced or tricked into becoming soldiers. It is best known in conflict-ridden African countries like Somalia, Sierra Leone, Uganda, and Sudan, but there are also known instances in Asian countries like Myanmar (formerly known as Burma), which was believed to have had the highest number of child soldiers at one point.

How Do People Get Trapped Into Sex or Labor Trafficking?

No one volunteers to be exploited. Traffickers frequently recruit people through fraudulent advertisements promising legitimate jobs as hostesses, domestics, or work in the agricultural industry. Trafficking victims of all kinds come from rural, suburban, and urban settings. There are signs when commercial establishments are holding people against their will.

Human trafficking isn't limited to sex and labor. Some kinds of human trafficking occur for the sake of harvesting a victim's organs. Organ trafficking (for kidneys in particular) is a rapidly growing field of criminal activity. Kidneys are a high-demand commodity, and they are the only major organs that can be wholly transplanted with relatively few risks to the life of the donor.

References:

Who Can Help in the United States? (humantrafficking.org)

Economic

Receive little or no payment
Have no access to their earnings
Be unable to negotiate working conditions
Be unable to leave their work environment
Have no days off
Work excessively long hours over long periods
Be under the perception that they are bonded by debt
Have had the fees for their transport to the country of destination paid for by facilitators, whom they must pay back by working or providing services in the destination
Be forced to work under certain conditions

Social

Be subjected to violence or threats of violence against themselves or against their family members and loved ones
Be unable to communicate freely with others
Have acted on the basis of false promises
Believe that they must work against their will
Be threatened with being handed over to the authorities
Be disciplined through punishment
Be in a situation of dependence
Have limited or no social interaction

Health (mental and physical)

Show signs that their movements are being controlled
Allow others to speak for them when addressed directly
Have no access to medical care
Suffer injuries that appear to be the result of an assault
Show fear, anxiety, or timid behavior
Suffer injuries or impairments typical of certain jobs or control measures
Act as if they were instructed by someone else
Be afraid of revealing their immigration status
Be distrustful of the authorities
Suffer injuries that appear to be the result of the application of control measures

Living Conditions

Feel that they cannot leave
Be found in or connected to a type of location likely to be used for exploiting people
Not know their home or work address
Live in poor or substandard accommodations
Come from a place known to be a source of human trafficking
Have limited contact with their families or with people outside of their immediate environment
Be unfamiliar with the local language.

Other

Claims of just visiting and inability to clarify where he/she is staying/address
Lack of knowledge of whereabouts and/or do not know what city he/she is in
Little to no education
Loss of sense of time
Has numerous inconsistencies in his/her story
Have false identity or travel documents
Not be in possession of their passports or other travel or identity documents, as those documents are being held by someone else.

This list is not exhaustive and represents only a selection of possible indicators. Also, the red flags in this list may not be present in all trafficking cases and are not cumulative. The presence or absence of any of the indicators neither proves nor disproves that human trafficking is taking place, but their presence should lead to some investigation.


How King v. Burwell can negate the current Healthcare landscape

It seems there's always something about Obamacare (formally known as the Affordable Care Act) that attracts detractors. This time around, the fuss is over a matter of mere words.

An overview in brief:

King v. Burwell [formerly known as King v. Sebelius] challenges an IRS regulation imposed under the Affordable Care Act that allows subsidies on both state and federally-established health insurance exchanges. The belief, on King's side, is that the IRS regulation violates the plain language of the law enacted by Congress, which gave states the choice to either set up such exchanges themselves or stay out of the program.


What this means:

If the court rules in favor of King, then that basically ends all hope of subsidy eligibility for people in the states without established state exchanges.

Whether it will negatively affect the people in states with already established state exchanges remains to be seen. From what I know, it doesn't seem mandatory for the states with exchanges to get rid of them if the ruling goes in favor of King, but with the mandate null and void, it doesn't seem there are many big obstacles for the state lawmakers to hurdle if they do, besides public unrest.

Speaking of public unrest, it seems there are going to be a lot of angry people over this. Those who don't have access to Obamacare anyway would essentially be slipping back into the dark ages of the old healthcare system, and there are people in the states with exchanges who can potentially lose their healthcare if the state decides to drop its exchange (something I don't find likely). Either way, that is still millions of people who may lose their healthcare.

And with the mandate gone, one of the ACA's main goals, reducing healthcare costs, gets sacked in the process. Hypothetically speaking, the mandate was what we were supposed to bank on to reduce healthcare costs for the country. Before the ACA, the healthcare system was plagued with high costs due to the large number of uninsured people. For consumers, it was normal for coverage to be denied because of preexisting conditions. Moreover, the healthy would wait until they got sick to buy insurance, indicating the existence of an adverse selection problem. Adverse selection in medical insurance markets occurs when people purchase insurance to cover known conditions. Before the mandate was a reality, costs went up for everyone because the sick were the ones left buying insurance, and as they used more insurance and filed more claims, premium rates went up, which also meant higher taxes to pay for the uninsured when they ended up needing emergency medical care. The mandate eliminates these issues, so if about 37 states can't (or won't) access that mandate, then we may end up seeing the ACA running half-fast (or half-assed).

The status bar of Blogger (or Blogspot) is that rectangular box shown at the top of the posts when we open a label or do a search. By default, it has defined styles for the border, background and font, but its neutral appearance is not always appropriate for the style you want on your blog. If you have a transparent background, for instance, the default status message might be hard to see once you put up a lighter wallpaper (usually an issue for the Awesome Blogger themes).

Customizing the status bar look

To customize this bar, it is necessary to add about 3 or 4 classes in the CSS section of the template:

status-msg-wrap (optional): the parent container of the bar; this is the place to define its position relative to the posts.
status-msg-body : Class responsible for defining the message style; changes font size and font color.
status-msg-border : Class that defines the edge, and changes the colors of that edge.
status-msg-bg : Class that defines the background color of the message bar.

I don't think it's strictly necessary to prefix these classes with the "#main" selector to overwrite the default values, but it is nice to have. It at least makes sure the code will work, though don't be surprised if there are no problems without it.

For you to implement these 3 or 4 classes, go to the Template section in your blog's Design page and click on Edit HTML.

Dashboard > Template > Edit HTML

All that HTML code is your blog template, and it is up to you to make sure you don't screw up any necessary functionality in your blog, so before you begin, do yourself a favor:

back up your template!

Copy and paste it into whatever text editor you've got. That way you don't have to worry about losing everything you worked to add to your blog's template.

Okay, so now you're at the HTML page, which should look like this.

Now I want you to click anywhere in the code box (just to make sure your computer knows you're referring to the code box and not the actual webpage), and do a search (Ctrl+F). Search for ]]></b:skin> and, once you find it, place the code that changes the status bar above (or before) it.

Examples

(The CSS code goes before ]]></b:skin>)

With all four classes...

       
#main .status-msg-wrap {
width: 90%;
padding: 5px;
}

#main .status-msg-body {
font-size: 80%;
text-align: left;
padding: 5px 5px 5px 30px;
width: auto;
}

#main .status-msg-border {
border: 1px solid #a19a36;
opacity: 1;
}

#main .status-msg-bg {
background: #FFF9B3 center left no-repeat;
opacity: 1;
}

That will turn out to look like this.

.status-msg-body font-size : alters the size of the text in the status message
.status-msg-body text-align: aligns the text to the center, left, or right of the status message box
.status-msg-body width: deals with the width of the status message box
.status-msg-border opacity: between 0 and 1, the closer to 1, the more visible the status message background is.

The way I did it just shows the name of the designated label (I named the label "ECON"), so I have a quick understanding of what section I'm in.

       
/* Label Status Message
———————————————————————————————————————–*/
.status-msg-body {    /* This changes the font and font color  */
font: 100% Century Gothic;
color: #fff;
text-transform: uppercase;
font-weight: bold;
}

.status-msg-bg {    /* This changes the background color of message bar */
background: #000000;
opacity: 1;
}

.status-msg-border {   /* This changes the color of the border */
border: 1px solid #e9d8d9;
opacity:0.7;
}

That will look like this.

Changing the Status bar message

To add text to your label header, return to Edit HTML, then use Ctrl+F / Command F to find this code:

<data:navMessage/>

There may be two of these codes, and in fact, the block may look just like this. In that case, just get rid of the first instance of the code.

  <div class='status-msg-wrap'>
    <div class='status-msg-body'>
      <data:navMessage/> <!--THIS CODE, GET RID OF IT-->
    </div>
    <div class='status-msg-border'>
      <div class='status-msg-bg'>
        <div class='status-msg-hidden'><data:navMessage/></div>
      </div>
    </div>
  </div>
  <div style='clear: both;'/>
  </b:if>

Type the message you wish to appear on your label header, such as:

You are now viewing my rants on: <data:blog.searchLabel/>

Where <data:blog.searchLabel/> is the code that will reflect the link to your label (so if one of your labels is "Help" or "World News", this code will directly correspond to that).

Now, your status bar should instead say that, or if you want to just show the label name, just use the <data:blog.searchLabel/> code by itself in place of the first <data:navMessage/> and click save. It should look like this.

 <div class='status-msg-wrap'>
    <div class='status-msg-body'>
      <data:blog.searchLabel/>
    </div>
    <div class='status-msg-border'>
      <div class='status-msg-bg'>
        <div class='status-msg-hidden'><data:navMessage/></div>
      </div>
    </div>
  </div>
  <div style='clear: both;'/>
  </b:if>

Changing the label colors

Return to Edit HTML, then use Ctrl+F / Command F to find the same code we have been working with:

<data:blog.searchLabel/>

And to change the message color, font, or style, simply add the needed font tags.

<font color="red">You are now viewing my rants on: </font>

Afterwards, just click Save. Your new label header should now appear whenever you click on any of your post labels.

Conclusion

The Blogger status bar structure is a bit complex, maybe poorly organized, so to customize this part properly, much care should be taken to overwrite all default styles.

Note: Strangely, the status bar message appears twice in the template, and only one of the two copies is set to be hidden (the one inside status-msg-hidden). Apparently this is just how the Blogger team built the template.

Note: This post was heavily influenced by someone else's post. I felt there was a need to add a few more pieces of information, so I adapted it, but you are more than free to check out the original source.

We all search for truth, when you think about it...

Just because someone holds a belief does not in any way mean they should be left alone to it. That is not who we are as people. We long for relationships and part of our longing is our exploration of each other and who we are in relation to each other. It is easy to get frustrated with someone you don't agree with (over touchy subjects in particular), but we ought to remember that while people are equal, ideas are not. The world will challenge us every day on what we believe, whether we argue about it or not, and we are always somehow putting our worldview to the test. If we come to understand that certain things we believe are false, then we will correct for that; otherwise, we continue to hold them. We do this because we long for meaning and significance in this life. We search for truth every day, and we get closer to it (or we think we get closer) through engagement with each other, trial and error, exploring, and sharing ideas.

Don't forget inflation!!

What if the US gets caught in the deflationary whirlpool that Europe is currently in? Pervasive economic weakness in the rich world and a slowdown in Chinese growth are leading causes for the sinking rate of inflation over the last few years, and oil prices haven't been much help either, pushing inflation into negative territory across much of the euro area. With inflation rates having dropped below 1%, the possibility is out there that the US may soon join Europe in deflation.

Besides that, inflation deserves a look for reasons beyond the possibility of deflation. When we think about jobs and the employment scene, we think a lot about unemployment rates, underemployment rates and even job growth rates, but inflation matters in this too. When the economy is strong, we should be seeing signs of inflationary pressure, so inflation rates this low are a big sign that there's room for improvement in the economy. One of those improvements ought to be in wages.

Wages are positively related to labor productivity, which means that when labor productivity increases, so do wages, hypothetically speaking. If nominal wages increase faster than increases in labor productivity, then we will have inflation in the economy, or should have inflation in the economy, assuming that we're not dealing with stagnant wages (which we are!).
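To put rough numbers on the wage and productivity reasoning above, here is a minimal sketch. It assumes the usual definition of unit labor cost (nominal wage per unit of output), and the growth rates are made up for illustration, not actual data.

```python
def unit_labor_cost_growth(wage_growth, productivity_growth):
    """Growth of unit labor cost (nominal wage per unit of output).

    Both arguments are decimal rates, e.g. 0.03 for 3%.
    """
    return (1 + wage_growth) / (1 + productivity_growth) - 1


# Hypothetical scenarios, not actual data.
scenarios = {
    "wages outpace productivity": (0.04, 0.01),
    "wages match productivity": (0.02, 0.02),
    "stagnant wages": (0.00, 0.02),
}

for name, (w, p) in scenarios.items():
    print(f"{name}: {unit_labor_cost_growth(w, p):+.2%}")
```

Only the first scenario pushes labor costs per unit of output up, which is the case where we would expect inflationary pressure; with stagnant wages the pressure runs the other way.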

Not all about Supply and Demand: the case of oil prices

It wasn't long ago when we thought oil was a scarce commodity in the US. Now we have more oil in our stockpiles than ever before, but oddly enough, sometimes we see conflicting signals. Hypothetically, the excess oil should be leading to lowered prices, but sometimes the opposite occurs.

For instance, Marketplace's Dan Weissmann (@danweissmann) made a short on fluctuating oil prices where he featured a story on oil prices going up because the inventory went up. According to his interviewee, Walter Zimmerman, prices went up due to imperfect information. With oil supply in excess, some traders expected prices to crash, but they didn't, and that may have worried some traders, so they bid prices up. As a correction, the oil prices dipped for the next few days, but Zimmerman claims that this kind of thing is common in oil trading.

It's one thing to know the factors that shape oil prices over time, but figuring out which of those factors are most important is really hard. Sometimes it is about neither supply nor demand, but about debt. But perhaps this is just something common to the world of trading. Sometimes you expect certain outcomes, and the disappointing results remind you every now and then that you aren't a wizard.

Un- and Under-researched topics in (Counter-) Terrorism Studies

An article by Alex P. Schmid lists 50 topics in the field of (counter-)terrorism that are either un-researched or under-researched. I'm not going to list them all here, but there are a few that caught my eye.

ISIS vs. al Qaeda: comparing and contrasting organizational structures.

ISIS's rise to infamy was rather fascinating. In 2014, they took over portions of Iraq and were aiming to achieve their goal of a caliphate. The world definitely took notice, and ISIS might have been considered a more dangerous group than al Qaeda around that time, but the very fact that there was an al Qaeda and an ISIS shows that there was a difference in ideology and (perhaps) management. What separates ISIS from al Qaeda? Are they organized any differently?

Terrorists released from prison: subsequent careers.

This would be an interesting find. What happens to extremists after being released from prison? Where do they go? How do these people attempt to re-assimilate into society? Do they try? Is their view on what society is and their path to extremism rather one and the same?

The targeting logic of terrorist attacks

(I think this was already explored in Berman's book, to a degree) What mode of attack do extremists use and under what circumstances do they use them? Why do they choose that particular mode of attack?

The terrorism - organized crime nexus: new insights and developments

This is by far one of the topics that I'm most interested in. Terrorism and organized crime aim at different kinds of ends, but find more use for each other than the public knows. What might we learn about (and from) the crossover between terrorism and drug trafficking? What about terrorism and human trafficking? etc.

Other:

Freedom of speech vs. incitement to terrorism: the response of the courts
Muslims, Christians, Hindus, Jews, and Buddhists under attack for their faith: a quantitative comparison in the post-Cold War period – claims vs. facts.
The Arab Awakening and its possible implications for terrorism and international counterterrorism cooperation.
Is there a disconnect between academic research on terrorism and the counter-terrorist intelligence community's knowledge (and knowledge requirements) regarding terrorism?
Non-violent popular revolt and Salafist Jihadism: competing paradigms for political change in the Islamic world.
New strategies for identifying and countering extremist ideologies on the Internet.
The rehabilitation of terrorists vs. the rehabilitation of common criminals in prison: recidivism records compared

Resources

Schmid, A. 2011. "50 Un- and Under-researched Topics in the Field of (Counter-) Terrorism Studies". Perspectives on Terrorism Vol 5, No 1.

Gretl: Picking between stacked time-series and stacked cross-sectional formats

When you're doing panel data regression in GRETL or any statistical software, you need your data formatted in a way that differentiates the time-series dimension from the cross-sectional dimension; having both is what makes panel data more sophisticated and more informative. In GRETL, the data can be "flattened" in two different ways:

Stacked cross-sections: the successive vertical blocks each comprise a cross-section for a given period.
Stacked time-series: the successive vertical blocks each comprise a time series for a given cross-sectional unit.

...but these two get confusing because they seem to be the same thing: both offer the same options for setting the number of cross-sectional units and time periods, regardless of whether you choose 'stacked cross sections' or 'stacked time series'.

So it made me wonder what's so important about one versus the other, and whether it should matter which one you pick. Based on the searches I made, it turns out that with either option GRETL ultimately ends up turning the data into stacked time series. A word from Allin Cottrell, one of GRETL's founders, sheds some light on this case.

Allin Cottrell:

...maybe it's a bit confusing but in fact it is intended. The main point is that a panel dataset _must_ be organized as stacked time series for use in gretl. "Stacked cross sections" is not an option for panel data in gretl, it's just a way of saying that your data are currently the wrong way round and need to be fixed.

So, if you go to "/Data/Dataset structure" and say that your dataset is stacked cross sections, gretl will reorganize it for you as stacked time series (a "physical" reorganization of the data). Note that if you then go back to the "Dataset structure" dialog your data will initially appear as "Stacked time series", so, naturally, stating that the structure is stacked time series will produce no change.

If you want to change the actual data layout back to what it was originally, you have to say that it's stacked cross sections (again), and gretl will reorganize the data in the opposite direction.

Perhaps that could be easier for users to understand in later updates.
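To make the two layouts concrete, here is a small sketch of the reordering Cottrell describes. This is not gretl's actual code, just my own illustration in plain Python; the function name and the tiny two-unit panel are made up, and it assumes a balanced panel (every unit observed in every period).

```python
def cross_sections_to_time_series(rows, n_units, n_periods):
    """Reorder rows from stacked cross-sections to stacked time series.

    Stacked cross-sections: all units for period 0, then all units for period 1, ...
    Stacked time series:    all periods for unit 0, then all periods for unit 1, ...
    """
    if len(rows) != n_units * n_periods:
        raise ValueError("expected a balanced panel of n_units * n_periods rows")
    reordered = [None] * len(rows)
    for i, row in enumerate(rows):
        period, unit = divmod(i, n_units)  # position in the period-major layout
        reordered[unit * n_periods + period] = row
    return reordered


# Hypothetical panel: 2 units (A, B) observed in 3 years.
stacked_cross_sections = [
    ("A", 2001), ("B", 2001),
    ("A", 2002), ("B", 2002),
    ("A", 2003), ("B", 2003),
]
stacked_time_series = cross_sections_to_time_series(stacked_cross_sections, 2, 3)
print(stacked_time_series)
```

The rows themselves never change; only their order does, which is why the same unit and period counts apply to both options in the dialog.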

Migrations and Diasporas of New York

A fascinating NYT post describes migration patterns to and from each state.

The migration data is based on Census data, particularly IPUMS, which was used to compare the state of residence versus the state of birth of a representative sample of Census forms. According to information aesthetics, the visualization technique used in this post "resembles that of organically shaped, stacked area graphs, also coined as stream graphs or ThemeRiver."

For a glimpse, here are two visuals on the migration patterns in NY.

MIGRATION INTO NEW YORK (NYT):

The image of New York as a beacon that attracts all is definitely true for immigrants, but for people born in the United States, the picture is more complicated. With the recent growth in immigration, the percentage of foreign-born residents in New York is approaching Ellis Island days. But domestically, one of the less-noticed trends is the decline in population of blacks born in other states. Since 1980, the population of Southern-born blacks has declined by more than 350,000. You can also see in the "U.S. Other" category the impact of migration from Puerto Rico, which was particularly large in the post-war years.

DIASPORA OUT OF NEW YORK (NYT):

Of the 20 million New Yorkers alive today, nearly one in six are now in the South, an idea that would have been almost unthinkable 50 years ago. Florida is still the main attraction, followed by New Jersey. But in terms of growth, since 1980 the number of New Yorkers living in South Carolina increased by about as much as the number living across the Hudson River. As our colleague Nate Cohn puts it, one of the state's leading exports is its people.

The basic logic of Non-linear Modeling: Binary Outcomes

Non-linear regression is very similar in spirit to linear regression and the general linear model. In this situation we are dealing with a binary outcome, and we are going to end up modeling it on a transformed, non-linear scale. What that means is we will start with something that is measured as a yes or no, one or zero, and then model the probability of it. The thing we're going to estimate is a linear function of our predictors, transformed through a non-linear function into a probability. Might sound a bit confusing, but just hang in there.

Regression with continuous dependent variables

Let's say you're all of a sudden interested in the attendance at baseball games. You want to understand what affects the level of attendance, let's call it ATTEND (in thousands of people), and you believe that a team's current wins in the season, call it CURNTWIN, have a say in what the level of attendance looks like. In other words, you believe there is a linear relationship between attendance at a baseball team's games and the wins they have in a season. This is called a linear model, and in that case, we can notate the model like this:

ATTEND = β0 + β1·CURNTWIN

or simply...

Y = β0 + β1·X

Based off this linear relationship, estimating this model is fairly easy. You can observe the relationship between the two using methods of summary statistics, like scatter plots, and based on the apparent upward trend in the plot points, you may infer that there is a positive relationship between the X and Y variables, probably leaving you with the question of how strong and significant the magnitude of this relationship is.

With this particular example, you're bound to receive regression estimates that confirm the positive relationship between the two variables.

 Dependent variable: ATTEND

              coefficient    std. error   t-ratio   p-value
  -----------------------------------------------------------
  const        −1340.92      387.127      −3.464    0.0009    ***
  CURNTWIN      38.6325      4.74786       8.137    6.02e-012 ***

And when you interpret the magnitude of the coefficients, you can conclude that for every unit increase in CURNTWIN, ATTEND increases by 38.6325. In layman's terms, since ATTEND is measured in thousands of people, for every additional win a baseball team has, its attendance is expected to increase by roughly 38,600 people.
OLS models are a very popular approach to regression analysis, and in these models we're dealing with dependent variables that have continuous values.
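As a quick check on that interpretation, here is a small sketch that plugs the reported coefficients back into the linear model. The coefficients are copied from the output above; the win totals and the function name are just values I picked for illustration.

```python
# Coefficients copied from the regression output above.
CONST = -1340.92
BETA_CURNTWIN = 38.6325


def predict_attend(current_wins):
    """Fitted ATTEND (in thousands of people) for a given number of wins."""
    return CONST + BETA_CURNTWIN * current_wins


# Hypothetical win totals, chosen only for illustration.
for wins in (70, 80, 90):
    print(f"{wins} wins -> {predict_attend(wins):.1f} thousand")

# The slope is the change in fitted attendance from one extra win.
marginal = predict_attend(81) - predict_attend(80)
print(f"one more win adds about {marginal:.1f} thousand")
```

Because the model is a straight line, that one-win difference is the same no matter which win total you start from.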

Regression with Dichotomous Dependent variables

Let's change that up and assume that instead of baseball wins and attendance, you want to find out the probability that a person attends college given the percentage level of their parental wage. We can just write that as...

or simply...

We're going to move away from the plain linear combination of the independent variables (LCIV) because it actually gives nonsensical results here. For instance, a regression equation with a continuous variable like income can provide useful information when you generate graphs, like a scatter plot, but with a dichotomous variable as Y, we get something where all the plot points are stuck at either one value or the other, like this:

The points are bounded to either 0 or 1, which makes a lot of sense (if you're familiar with dummy variables) because 0 and 1 represent outcomes (0 ="didn't attend college" ; 1 = "attended college").
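To see why the plain linear combination gives nonsensical results for a 0/1 outcome, here is a minimal sketch. The coefficients are hypothetical numbers I made up (they are not estimates from any data), and x stands for the parental wage level on a 0 to 100 scale.

```python
# Hypothetical coefficients for a linear model of college attendance (0/1)
# on parental wage level; they are not estimates from real data.
B0 = -0.2
B1 = 0.015


def linear_prediction(wage_level):
    """Fitted value from the plain linear combination b0 + b1 * x."""
    return B0 + B1 * wage_level


for x in (5, 50, 95):
    y_hat = linear_prediction(x)
    status = "ok" if 0.0 <= y_hat <= 1.0 else "outside [0, 1]"
    print(f"x = {x:>2}: fitted value {y_hat:+.3f} ({status})")
```

At the low and high ends the fitted "probability" falls below 0 or above 1, which cannot be read as a probability at all.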

So the fix is to take a non-linear transformation, F, of the linear combination of independent variables, looking like this:

P(Y = 1 | X) = F(β0 + β1·X)

The properties we expect function F to have reflect the discrete outcomes of interest (as related to qualitative variables). In that role, function F is expected to output 0 as the linear combination of independent variables tend towards negative infinity, and F is expected to output 1 as the LCIVs tend towards positive infinity. That's just a mathematical way of stating this hypothetical point, that as we see a continual increase (or decrease) in the value(s) we get from the LCIVs, we should also gradually expect the outcome of interest (Y) to be a likely occurrence (or likely nonoccurrence).

The corresponding math notation usually looks like this:

F(-∞) = 0

F(+∞) = 1

and thus, this transformation of the independent variables means that our probability lies in the range between 0 and 1, or...

0 ≤ P(Y = 1 | X) ≤ 1

In context, let's assume that the value for X is very high, indicating high parental income; in that case, we can assume that the probability of a person going to college is virtually 1, or very likely to happen.
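Two common choices for F are the logistic function (logit) and the standard normal CDF (probit). Here is a minimal sketch of both using only Python's standard library; z stands for the value of the linear combination, and the specific z values are just ones I picked to show the behavior at the extremes.

```python
import math


def logistic_cdf(z):
    """Logit link: F(z) = 1 / (1 + exp(-z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def normal_cdf(z):
    """Probit link: the standard normal CDF, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# z stands for the linear combination b0 + b1 * x.
for z in (-10, -2, 0, 2, 10):
    print(f"z = {z:>3}: logit F = {logistic_cdf(z):.4f}, probit F = {normal_cdf(z):.4f}")
```

Both functions approach 0 for very negative z and 1 for very positive z, and both return 0.5 at z = 0, which is part of why logit and probit estimates tend to tell a similar story.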

Conclusions

The basic logic of logit and probit modeling is grounded in this reasoning, and the two will often give similar results in their estimations. Instead of trying to see the effects of the X var(s) on a continuous Y variable, like the effects of income on overall consumption, you're trying to see how the X var(s) influence the probability that the dichotomous Y outcome will occur; in other words, the likelihood of outcomes.

It should be understood that yes, we can expect a higher or lower probability of an event occurring, but that does not mean it will indeed happen or not. The ultimate outcome can most definitely go the other way.

Note: Regression equations always include error terms, which were purposely excluded in this post. Just know that you are to always include the error term when conducting analyses for academic or professional work.

Sources:

Heavily borrowed and influenced from:

YOUTUBE: Discrete choice models - introduction to logit and probit

Visualizing the Middle East

A neat interactive graphic by Information is Beautiful depicts the network relationships among the key players in the Middle East (click on the picture to see it).

Notes:

The number of Sunni versus Shia groups is astounding. The world simply thinks about the Muslim community as just one community, but the sectarian divides are serious among the Middle Eastern Sunni and Shia groups. Although I was aware of the sectarian divide, I don't think it was until ISIS's rise in 2014 that I began to see just how serious it can get between differing Muslim groups.
ISIS hates...everyone? I didn't really see that coming. I knew ISIS was infamous, but I didn't think they had enemies practically all around them. Maybe this doesn't mean too much, given that al Qaeda, the group it spawned from, has way more negative connections. However, AQ has been around for many more years and picked up more negative relationships over time.
"STATE OF PALESTINE": I know there are many out there who have sympathies for what happened to the Palestinian people as Israel grew and probably want to use the term Palestinian state as a way to recognize its existence, but it's just not correct. Palestine is NOT a recognized state, and never was. If it was, then the modern day Israel probably wouldn't have existed.

8 ways a college gap year can benefit you

When I graduated college, I decided to take a gap year from school. It’s not because I was tired of school: in fact, I usually looked forward to attending college. I thought a year off would just help me get time to take the GRE, and maybe I could get a job in the meantime.

But what took me a while to realize is that taking a gap year actually helped me grow in other ways I never thought I would, or could. So here’s a few things to think about: 8 ways a college gap year can benefit you.

1. A chance to see things on the other side.

If you're taking a gap year after 18 or so years of formal education you will begin to realize that life is different on the other side. The fall and spring semesters, after going through so much routine, will be filled with a lot of void, and it's weird. All of your other friends will continue with their studies and you won't. Sometimes you may get distanced from other friends, as your interests end up differing.

This may not seem beneficial but best believe it is. When we are in school, we tend to misunderstand what many call the "real world" because many of us haven't experienced it just yet. The reality is, unless you're working at a school right after you graduate, you're not going to be in school for very long, and the school norms you are so used to are not the same as the business norms you will have to learn about. It just puts things in perspective.

2. Opportunity to enjoy leisure

While you're on that gap year you might as well have fun with the free time you have. It's most likely a top 3 reason why you're taking the gap year, so why not catch up on your favorite books and TV shows, discover new music, or go on vacation? Chances are you will need it, but we know you will enjoy it, and you probably won't get a chance like that for a long time once you return to formal education.

3. Saving money

I don't know about you, but I'm pretty happy that I don't have to worry about taking out $20,000 in loans just to pay for another year's worth of education. This is especially so if I don't like the direction I'm going with my current major. So while I'm off, I might as well take the time to assess whether it will be worth the extra money to continue down this path.

4. Time to learn and grow / Self-assessment

This is quite important because it gives you the chance to learn more about yourself. You learn more about your passions, learn more about what you like, and what you dislike. You think about things you've done in the past and whether you would like to continue doing them in the future. You also get to think about what you could have learned to do better and what you can do to improve yourself in whatever area you want to improve in. Depending on that area, you might be motivated to return to college.

5. Skills-assessment

I know this is closely tied to the previous point, but I find this one huge enough to stand on its own.

Before you apply for a job, it is important to distinguish between the skills you have and the skills you need to attain. When you’re going through a gap-year, sometimes you may apply for jobs you want and realize you’re not qualified. So you see that, and it can be discouraging for you because you don’t have the skills, or it can be encouraging for you to learn what you need to know before you apply for jobs like that in the future. Some skills can be attained during your gap year through outlets not labeled formal education, but there are skills that you will realize can be best attained and practiced if you were in college. It just adds to your motivation and gives you a focus to seek out those sources where you can properly learn the desired skill(s).

6. Self-fulfillment

A gap year is a chance for you to take a breath and do something that you want to do.

Take on a volunteer project or a personal project. Give yourself a chance to go through some growing pains that you really find worth the trouble; it's more rewarding that way. Maybe you can spend time building and writing a blog. Maybe you want to learn new things through some massive open online courses (MOOCs), like code or maybe a new language. Maybe you want to do some research on topics that interest you, or maybe you want to join and volunteer for a cause. Whatever it is you want to do that can give you a real sense of personal achievement, a gap year is a special time to get it done.

7. A chance to develop professionally

A gap year also gives you a chance to explore careers and increase your work experience, through paid employment or voluntary work. It helps you develop your professional networks, improves your skillset, and it gives you some experience to put on the resume.

It will look very good on a resume and it shows that you've been busy during your time off. Things like this are very notable to future employers and colleges.

8. A better focus when you return

During your gap year, once you find that crossover between your interests and what you think you can pursue, you begin to understand more specifically what you want out of college (other than that diploma). Whether it's of the undergraduate, graduate or post-graduate kind, you never want to leave feeling that you graduated with a useless degree. That very feeling can be a reason for pursuing a gap year in the first place. Maybe there was some kind of skill that you would find very useful that you failed to pick up as an undergrad or in your previous years of schooling. The gap year will give you the time to think about what you intend to pursue and the channels that can help you with that pursuit, which may help you realize which college you should really attend.

Conclusion

If or when you decide to take a gap year, remember that it can indeed be a welcome break, but it can also be frustrating. It may be a way to regroup and recharge your battery or a way to explore your life in ways that you never had time to before. However, even if you have lofty plans for the gap year, you can also end up zoned out in front of the TV watching reruns or stuck behind a computer wasting your life away. I can't stress enough just how easy it is to end up in the latter.
