I have a dataset of online product reviews (without any grades, stars, etc.). I applied the integrated PowerBI AI-Insights Text Analytics Sentiment Analysis model to this dataset and got a sentiment score for each review. Next, I transformed the score into discrete text values: POSITIVE, NEGATIVE and NEUTRAL.
The dataset is artificially created by me, so I know the polarity of each comment. Now I want to compare the predicted value to the actual value. I've done this by adding a new column that compares the actual value with the predicted value and displays "PREDICTED" if the correct value was predicted and "NOT PREDICTED" if the prediction was wrong (it doesn't matter whether it is positive, negative or neutral). My goal is to calculate some model metrics so I can evaluate the capabilities of this PowerBI integrated model and visualize the results. How can I do this? Is "accuracy" the first thing I should start with? If yes, how can I calculate and visualize a result like the "accuracy"?
Thank you for all your answers in advance.
Yes, start with accuracy: the number of "PREDICTED" rows divided by the total number of rows. If more than 70 to 80 percent of the predictions are correct, you can reasonably rely on the PowerBI AI-Insights Text Analytics Sentiment Analysis model and go on to build your visuals on the sentiment data. If the split between predicted and not predicted results is closer to 50-50, consider a third-party sentiment analysis service such as Google or Alchemy.
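As a minimal sketch of that calculation outside Power BI, the Python below computes accuracy and a table of (actual, predicted) counts, which shows which classes get confused with each other. The function names and the five toy reviews are hypothetical and only illustrate the arithmetic; they are not taken from the dataset in the question.

```python
from collections import Counter


def accuracy(actual, predicted):
    """Share of reviews whose predicted label equals the actual label."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("no reviews to evaluate")
    hits = sum(a == p for a, p in zip(actual, predicted))
    return hits / len(actual)


def confusion_counts(actual, predicted):
    """Count each (actual, predicted) pair; matching pairs are correct predictions."""
    return Counter(zip(actual, predicted))


# Hypothetical toy labels, for illustration only.
actual = ["POSITIVE", "NEGATIVE", "NEUTRAL", "POSITIVE", "NEGATIVE"]
predicted = ["POSITIVE", "NEGATIVE", "POSITIVE", "POSITIVE", "NEUTRAL"]

print(f"accuracy: {accuracy(actual, predicted):.2f}")
for (a, p), n in sorted(confusion_counts(actual, predicted).items()):
    print(f"actual={a:<8} predicted={p:<8} count={n}")
```

In Power BI itself, the same number would come from a measure that divides the count of "PREDICTED" rows by the count of all rows, which can then be shown in a card visual; the pair counts map naturally onto a matrix visual with the actual label on rows and the predicted label on columns.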
I have the bar chart below, which has a measure for Achievement % (in this chart, 100% is shown with two decimals, i.e. 100.00%).
However, I got feedback from leadership that the 100% value should not show decimals. I have tried a lot with DAX and formatting but could not get the result shown in the chart below: I want only 100% to appear without decimals (i.e. 100%), while all other values keep two decimals.
Is this possible in PowerBI, as it is in Excel? If yes, what is the workaround?
Please suggest
Regards,
SK
I am trying to create a forecast but this is the error that I get:
I am working with about 300,000 rows of data. Most of the report has already been built. My data just doesn't contain certain dates. How can I solve this issue?
So the issue boils down to the problem of "how to create an evenly spaced timeline". You can easily achieve this in PowerQuery:
Create a separate daily date table.
Outer join your observations onto the dates, which will give you "null" for the unobserved days.
Apply the "fill down" operation on your values column, which basically means that the last value will be repeated until a new observation appears.
This evenly spaced time series is suitable for ML forecasting, at least when it comes to predicting trends. But the real power of this feature in Power BI is in predicting seasonality, and you most likely won't get that right with the fill-down interpolation above.
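The three steps above (date table, join, fill down) can be sketched in plain Python to show what the resulting series looks like. This is only an illustration of the logic, not Power Query code; the function names and the two sample observations are hypothetical.

```python
from datetime import date, timedelta


def daily_dates(start, end):
    """Step 1: a gap-free daily date table from start to end, inclusive."""
    days = (end - start).days
    return [start + timedelta(days=i) for i in range(days + 1)]


def to_even_series(observations):
    """Join observations onto a daily date table, then fill down.

    observations: dict mapping date -> value (some dates may be missing).
    Returns a list of (date, value) pairs with one entry per calendar day.
    """
    if not observations:
        return []
    calendar = daily_dates(min(observations), max(observations))
    # Step 2: join onto the dates; days without an observation get None ("null").
    joined = [(day, observations.get(day)) for day in calendar]
    # Step 3: fill down; repeat the last seen value until a new one appears.
    filled, last = [], None
    for day, value in joined:
        if value is not None:
            last = value
        filled.append((day, last))
    return filled


# Hypothetical observations with 2 and 3 January missing.
observed = {date(2024, 1, 1): 10.0, date(2024, 1, 4): 13.0}
series = to_even_series(observed)
for day, value in series:
    print(day.isoformat(), value)
```

The repeated values on the filled days are exactly why this approach flattens any seasonality in the gaps: the forecast sees a step function rather than the pattern that was never observed.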
I am working on a research project that will relate the sentiment analysis done on tweets with financial markets indexes, such as S&P500 and VIX. My work is based on the Tetlock (2007) paper.
So I have categorized every word in every tweet according to the Harvard IV psychological dictionary and then summarised the result on a daily basis (i.e. got the frequency of each category per day). I then rescaled the frequencies by dividing by the total number of words in all tweets on that day. Also, I have selected only a few categories and not all 180+.
The idea is to construct a factor that captures the latent sentiment on those tweets and the obvious choice is to run a PCA on my frequency sentiment categories data (it is also Tetlock's approach).
My issue is that categories such as Pstv and Ngtv have the same loading signs for the first factor, while I expected them to have opposite signs. Here is a screenshot of the R console where we see the loadings, and here is the scree plot of the components.
Any idea why this would happen?
In PowerBI I'd like to build a non-standard matrix, very similar to the report in Google Analytics.
What I have now:
I want to replace the subtotal with a measure that is calculated as the percentage difference between the two values.
What I want to get:
In Power BI, there is no way to override the subtotals of a matrix with a calculation. Part of the challenge is that you know there are only two date ranges, but as far as Power BI is concerned, there could be any number of date ranges.
It's difficult to tell from your question exactly what input you have and what output you're looking for. Further, the numbers in your screenshots are obscured. However, one consideration would be to solve the problem using measures (i.e. a measure representing the first date range, a measure representing the 2nd date range, and then a measure calculating the difference between them). You may need to change the layout of your visual a little to make this work and the specific design would depend on how static your date ranges are.
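To make the three-measure idea concrete, here is a small Python sketch of the arithmetic: one total per date range and the percentage difference between them. The function names, the date ranges and the sample rows are all hypothetical; in Power BI each function would be its own measure.

```python
from datetime import date


def total_for_range(rows, start, end):
    """Sum of values whose date falls within [start, end], inclusive."""
    return sum(value for day, value in rows if start <= day <= end)


def pct_difference(current, previous):
    """Percentage change from previous to current; None if previous is zero."""
    if previous == 0:
        return None
    return (current - previous) / previous * 100


# Hypothetical rows of (date, sessions), for illustration only.
rows = [
    (date(2024, 1, 5), 120),
    (date(2024, 1, 20), 80),
    (date(2024, 2, 3), 150),
    (date(2024, 2, 18), 100),
]

previous_total = total_for_range(rows, date(2024, 1, 1), date(2024, 1, 31))
current_total = total_for_range(rows, date(2024, 2, 1), date(2024, 2, 29))
print(previous_total, current_total, pct_difference(current_total, previous_total))
```

Returning None when the earlier range is zero mirrors what a blank result would look like in the visual, rather than raising a division error.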
I have looked into the responses of the "ItemSearch()" and "lookUp()" functions in the Amazon Advertising API and could not find a way to get the daily/monthly sales of an item.
Popular product research software such as JungleScout, ProfitPhonix, AMZ Tracker, etc. displays the number of monthly sales, but each tool shows different results.
Does Amazon provide this information? If not, how is the above software estimating it?
I think that when they fetch the ASIN information, they store "something" in their DB, and the next time the same ASIN is pulled, the estimated sales are roughly calculated based on the previous value/score in the DB.
Any help will be highly appreciated.
Thanks
It is not a solution, but here is a reply from UnicornSmasher that I found; it may save you time searching for something that doesn't exist.
constantine We just took all of the bulk data from the products that are being tracked in AMZ Tracker and applied a formula to it all. If you have specific products that are way off please let us know! Certain categories we had less data on. This is version 1 of the research tool, so I'm sure it will continue to improve quickly over time.
Here is the link to question and answer:
amz forum
So, now, the question is 'What formula do they use?'
Let me know if you come up with an idea :)
Let me tell you first that if you're not part of the Amazon data team, you can't get the sales numbers of any product. And it's probably not easy to estimate sales using the Amazon Advertising API. You need to constantly track a huge number of products to estimate the sales. Here I can explain how AMZ Insight, an Amazon tracking tool, estimates the sales of any product.
They constantly track a few thousand products from all the categories and collect massive amounts of data. Then their in-house data scientists analyze the data to build the sales estimation algorithm. The relationship between the data points forms a scatter plot, which of course means the sales estimates are not 100 percent accurate.
Data is continuously gathered and analyzed by tracking the Best Seller Rank (BSR), Buy Box, reviews and other factors. A relationship between these data points is then derived to come up with the unit sales. Once this relationship is in place, it is much easier to estimate monthly sales and revenue for a product.
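As a rough illustration of what "deriving a relationship" from tracked data could look like, the sketch below fits a straight line to log(sales) against log(BSR) by least squares and uses it to estimate sales for an unseen rank. The power-law shape is purely an assumption made here for illustration; it is not the formula used by AMZ Insight or any other tool, and the three data points are synthetic, not real Amazon figures.

```python
import math


def fit_power_law(ranks, sales):
    """Least-squares fit of log(sales) = intercept + slope * log(rank)."""
    if len(ranks) != len(sales) or len(ranks) < 2:
        raise ValueError("need at least two (rank, sales) pairs")
    xs = [math.log(r) for r in ranks]
    ys = [math.log(s) for s in sales]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct ranks")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def estimate_sales(rank, intercept, slope):
    """Estimated monthly unit sales for a given rank under the fitted curve."""
    return math.exp(intercept + slope * math.log(rank))


# Synthetic (rank, monthly sales) pairs, made up for illustration only.
tracked_ranks = [1, 100, 10_000]
tracked_sales = [1000.0, 100.0, 10.0]

intercept, slope = fit_power_law(tracked_ranks, tracked_sales)
print(f"slope={slope:.3f}")
print(f"estimated sales at rank 400: {estimate_sales(400, intercept, slope):.1f}")
```

A real estimator would need far more tracked products, separate fits per category, and additional inputs such as reviews and Buy Box data, which is consistent with the different tools reporting different numbers.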