Separate webchat conversation into one message per row - sas

I have a SAS dataset with the transcription of a web chat in one row
Example here:
conversation id = 346768584212
Transcript =
11:13:57 info: Thank you for choosing to chat with us. An agent will be with you shortly...11:13:58 info: You are now chatting with Harsh...11:14:00 Shahid: Hello..11:14:03 Shahid: HI Harsh..11:14:25 Shahid: I have received two customers numbers one for personal banking and one for business banking..11:14:30 Harsh
I'd like to use SAS to split this out so that there is one row per message
Example:
conversation ID
346768584212
346768584212
Message
11:13:57 info: Thank you for choosing to chat with us.
11:14:03 Shahid: HI Harsh..
I can't figure out how to split it by timestamp. Any advice would be much appreciated.
Thanks
Tom

This should be pretty easy to do. Here's a reference:
SAS regular expressions
Check out PRXNEXT on page 12. Your regular expression would be "/\d\d:\d\d:\d\d/".
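To illustrate the idea outside SAS, here is a minimal Python sketch (not SAS code) that uses the same timestamp pattern to find where each message starts and cuts the transcript at those positions. The function name split_messages is hypothetical, and the sketch assumes every message begins with an HH:MM:SS timestamp, as in the example transcript.

```python
import re

# Same pattern as suggested above: two digits, colon, two digits, colon, two digits.
TIMESTAMP = re.compile(r"\d\d:\d\d:\d\d")

def split_messages(transcript):
    """Return one string per message, each starting at a timestamp.

    Any text before the first timestamp is dropped.
    """
    starts = [m.start() for m in TIMESTAMP.finditer(transcript)]
    # Each message ends where the next one starts; the last runs to the end.
    ends = starts[1:] + [len(transcript)]
    return [transcript[s:e].strip() for s, e in zip(starts, ends)]

if __name__ == "__main__":
    sample = ("11:13:57 info: Thank you for choosing to chat with us..."
              "11:14:00 Shahid: Hello..11:14:03 Shahid: HI Harsh..")
    for message in split_messages(sample):
        print(346768584212, message)
```

In SAS the same loop would be driven by PRXNEXT, writing one output row each time a new timestamp position is found.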

Related

How do I match a group of text under a title that changes

I'm new to regex, and everyone here has been an awesome resource for help, but I'm running up against a wall: no matter what I try, I can't seem to get the grouping to work.
I'm looking to match the name of each room and the products and services that belong to that room. The number of rooms can vary, as can their names. The description of the product or service may change, but the line will always start with "Product" or "Service".
If anyone can point me in the right direction it would be truly appreciated.
Master Bedroom
Product description of the product
Product description of the product
Service description of the service
Kitchen
Product description of the product
Services description of the service
You will probably get better results if you can use a regex alongside a bit of postprocessing. For example, the following regex will match all of the service/product lines:
(Product|Service[s]?)(.*)
But you will still need to get the name of the header. You could perhaps start with something like this:
(.*)\n((Product|Service[s]?)(.*)\n)+
In which case your capturing groups will include the name of the heading and then ALL of the lines in that section; you can then split and process each with the first regex I provided.
If you're able to share which programming language/tool you're using to run this processing, I can help you write the code to split the data correctly from the first regex.
You can look at this regex in action at regexr:
For the input string:
Master Bedroom
Product Bedknobs, cheap
Product Beautiful carpet polish
Service Free pillow sharpening
Kitchen
Product Sink grease
Services Inexpensive cucumber delivery
You will get the following groups:
Master Bedroom
Product Bedknobs, cheap
Product Beautiful carpet polish
Service Free pillow sharpening
and
Kitchen
Product Sink grease
Services Inexpensive cucumber delivery
[edit] note that this regex WILL capture the "Product/Service" string as its own group... Figured you could always throw it away if you didn't need it, but didn't hurt to have access to it after parsing :)
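Since no language was named in the question, here is a minimal sketch of the "regex plus postprocessing" approach in Python, chosen only for illustration. It applies the first regex line by line and treats any line that does not start with Product or Service(s) as a room heading. The function name group_by_room and the dictionary layout are assumptions, not part of the original answer.

```python
import re

# First regex from the answer, anchored to the line start and written as
# Services? instead of Service[s]? (equivalent).
ITEM = re.compile(r"^(Product|Services?)\s+(.*)$")

def group_by_room(text):
    """Map each room heading to a list of (kind, description) tuples."""
    rooms = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        match = ITEM.match(line)
        if match is None:
            # Not a Product/Service line, so assume it is a new room heading.
            current = line
            rooms[current] = []
        elif current is not None:
            rooms[current].append((match.group(1), match.group(2)))
        # Item lines that appear before any heading are ignored.
    return rooms

if __name__ == "__main__":
    sample = """Master Bedroom
Product Bedknobs, cheap
Service Free pillow sharpening
Kitchen
Product Sink grease
Services Inexpensive cucumber delivery"""
    for room, items in group_by_room(sample).items():
        print(room, items)
```

Processing line by line avoids relying on a repeated capture group, which in most regex engines keeps only the last repetition.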

Is there a limit to how long a filename URL statement can be?

I am now on, I think, design number three of a program that submits a series of stock tickers and metrics to Yahoo Finance. I don't need to go into too much detail about what it does, as I have most of it up and running apart from one remaining issue.
The Yahoo Finance site lists about 2700 stock tickers on the NASDAQ alone. I anticipated that submitting all of these in one filename URL statement might fall over for some reason, so set an initial string length of 500 tickers and built some nested macros to iterate through in 500 ticker blocks until everything I wanted had been extracted.
However, during development of the code it seems that if I build a string with more than about 200 tickers in it, I get an error telling me that SSL support cannot be run, and the code falls over.
Does anyone have any idea why this is? In an ideal world I would like to do this in one pass, where all 2700 stock tickers are pulled down. If this isn't possible, it would be great if someone could explain why not.
Thanks
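The batching described above can be sketched as follows. This is Python rather than SAS macro code, and it only shows the block-splitting logic; it does not explain the SSL error or establish what the actual URL length limit is. The block size of 200 is taken from the observation in the question, and the names chunk and the "," separator are assumptions.

```python
def chunk(items, size):
    """Split a list into consecutive blocks of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[i:i + size] for i in range(0, len(items), size)]

if __name__ == "__main__":
    # Hypothetical ticker symbols standing in for the ~2700 NASDAQ tickers.
    tickers = [f"TICK{i}" for i in range(2700)]
    for block in chunk(tickers, 200):
        # One request string per block; the separator is an assumption.
        request_symbols = ",".join(block)
        print(len(block), len(request_symbols))
```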

Getting stocks by industry via Yahoo Finance

I want to list all available industries (like http://biz.yahoo.com/p/ ) and show all corresponding stocks.
Until now I'm using YAHOO.Finance.SymbolSuggest.ssCallback for the symbol suggestion and http://finance.yahoo.com/d/quotes.csv?s=... for getting the stock's data.
Does anyone have any idea how to get all industries and corresponding stocks?
Is there another hidden Yahoo API?
The lists of available industries are called GICS Sectors at Standard and Poor's (the S&P 500 uses them) and ICB at Dow Jones and FTSE. ICB is therefore used by NASDAQ, NYSE and other markets.
It seems like Yahoo uses a third industry classification by Morning Star, but since I'm not quite sure I will give both ways of retrieving data.
Morning Star
I don't know if Yahoo really sticks to this classification, but some names were really close so let's see it:
You need to go to their Index Data and in each sector, click on it and then at the bottom View complete index holdings.
It's not as precise as in Yahoo industry list, but it's all you can do with Morning Star. Not very convincing, I know...
GICS Sectors
GICS Sectors are now a trademark of Standard and Poor's, so the data has to be looked up on S&P's website.
Short answer: take a look at this page, you will need to be registered (it's free and easy) and you can download spreadsheets (xls) with stocks and corresponding sectors. Nevertheless, things aren't always easy, and you will have to do a bit of a search to retrieve all stocks with their corresponding industries. For example, the file INDICATED_RATE_CHANGE.xls will give you some companies and their sectors in each month of 2012. Using that and SP500_DividendAristocrats_2012.xls you should be able to retrieve at least a large part of S&P 500 companies.
ICB
ICB is used by NYSE, NASDAQ etc... Then it's a lot simpler than S&P and MorningStar. Here is your answer. BOOM! Direct link!
Link is dead :(
Finally
I strongly advise you to use the simplest and most widely used industry classification: the ICB. It will always be available and publicly displayed, since millions of investors rely on it every day, without having to use S&P financial services or MorningStar brokerage services...
EDIT
You can look at nasdaq.com to retrieve all companies and their corresponding sector: here for Nasdaq and here for Nyse
Get all industry-IDs from here:
http://biz.yahoo.com/ic/ind_index.html
(look at the links)
Then use YQL ( https://developer.yahoo.com/yql/console/ )
with a query like this:
select * from yahoo.finance.industry where id=912

How can I get x most valuable companies using some finance APIs?

I need to find a Web Service which allows me to retrieve the following type of data:
The 30 most valuable companies, and for each company the following information:
Company name, symbol, state and zipcode
Current market price of the stock
Change in the price of the stock since yesterday’s market close.
Beta value of the stock
Thanks in advance!
I would expect to obtain this information from Forbes. As Steve suggested, you should get it from xignite.com . Apparently, Forbes makes their data available through them:
http://www.xignite.com/News/PressRelease.aspx?articleid=86
You might also have a look at Yahoo Finance: http://www.gummy-stuff.org/Yahoo-data.htm.
Here is an example. This is not an official API, and there is no documentation.

Yahoo! Finance API DOW [closed]

Until now, I've been using the INDU ticker to follow the DOW with the Yahoo! API. For whatever reason you were unable to directly follow ^dji ^djia or any other reasonable combination. Up until yesterday, INDU was working fine. However now I receive no data when requesting indu.
What other ticker can I use with the Yahoo! finance API that will return the DJIA?
This index is not available under any other name.
However, this problem was just a temporary glitch, now resolved by Yahoo. Unfortunately, their financial data availability is very flaky lately. E.g. data available on the web page, but CSV downloads give "N/A" for all fields, etc. There were similar incidents in recent months, with stock prices for random stocks given wrong values, and more.
So, if you're building a new service around these Yahoo services, be aware that:
These services are not reliable.
You're breaking Yahoo's ToS, so there's nothing you can do if the services are broken or not working. You cannot even complain to Yahoo in good faith.
According to Yahoo (post by Yahoo Developer Network Community Manager Robyn Tippins on Yahoo developer forums):
The reason for the lack of documentation is that we don't have a Finance API. It appears some have reverse engineered an API that they use to pull Finance data, but they are breaking our Terms of Service (no redistribution of Finance data) in doing this so I would encourage you to avoid using these webservices.
The formula for the DJIA isn't very complicated. If you are still able to pull quotes from individual stocks, you could use your code to pull the prices of the existing 30 components of the DJIA, add them up and divide by the current divisor. Of course, this has several disadvantages.
You need to make 30 requests instead of one.
You will have to adjust the divisor if there is a stock-split.
You will have to change the queries when the components change.
The components of the DJIA are
AA AXP BA BAC CAT CSCO CVX DD DIS GE HD
HPQ IBM INTC JNJ JPM KFT KO MCD MMM MRK
MSFT PFE PG T TRV UTX VZ WMT XOM
The current divisor is 0.132129493.
The divisor changes whenever there is a stock split in one of the components. The components of the DOW changed 48 times from 1896 to 2009.
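The calculation described above (add up the component prices and divide by the divisor) can be sketched in Python as follows. The function name price_weighted_index and the prices in the usage example are hypothetical; only the divisor value comes from the answer, and it will be out of date after any later split or component change.

```python
def price_weighted_index(prices, divisor):
    """Return the sum of the component prices divided by the index divisor."""
    if divisor <= 0:
        raise ValueError("divisor must be positive")
    return sum(prices) / divisor

if __name__ == "__main__":
    # Made-up prices for 30 components, for illustration only.
    hypothetical_prices = [50.0] * 30
    divisor = 0.132129493  # divisor quoted in the answer above
    print(price_weighted_index(hypothetical_prices, divisor))
```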
It seems like Yahoo Finance does not support the web service to query ^DJI or INDU.
Check out this discussion:
http://developer.yahoo.com/forum/General-Discussion-at-YDN/Dow-Jones-Industrial-Average-Quote-Error/1317052217631-f9173931-04fd-4519-b1b3-efb65d7ff8fa/1317065435082
Assuming that your application does not need to be real time market data (to the second), you can use the RAW data that is provided to build the interactive graph on yahoo. This data is comma separated and updates about once every minute. The downside: it will include all the data from the trading day. The time given is in Unix time so a conversion would be needed. I tried this out for the ticker symbols you listed and the only one I was able to get data with was ^dji. Hopefully this is what you are looking for!
You can mess with the link and see what happens to the data. For example you can change the amount of days.
http://chartapi.finance.yahoo.com/instrument/1.0/%5Edji/chartdata;type=quote;range=1d/csv/
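The Unix-time conversion mentioned above might look like this in Python. This is a minimal sketch that assumes the timestamp is given in whole seconds since the epoch and converts it to UTC; the function name unix_to_utc is hypothetical, and the layout of the CSV returned by the chart URL is not assumed here.

```python
from datetime import datetime, timezone

def unix_to_utc(timestamp):
    """Convert a Unix timestamp (seconds, as int or numeric string) to a UTC datetime."""
    return datetime.fromtimestamp(int(timestamp), tz=timezone.utc)

if __name__ == "__main__":
    # A timestamp field as it might appear in a CSV row (string form).
    print(unix_to_utc("86400").isoformat())
```

Converting to the exchange's local time zone would be a further step on top of the UTC value.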
I think Yahoo Finance All Currencies quote API Documentation will help you.
I found a Yahoo forum answer that says we cannot download CSV data for ^DJI.
Check also YQL console. This console will fetch values in JSON format.
The DIA ticker (SPDR Dow Jones Industrial Average) closely imitates the Dow.