List.Generate() generating list of same values - powerbi

I am trying to iterate through a cursor-paginated web API. My code:
let
    FnGetOnePage = (cursor_ as text) => Json.Document(Web.Contents(URLEndPoint, [
        Headers = [#"Content-Type"="application/json", #"Authorization"=token],
        Content = Text.ToBinary("{""kind"": """ & kind & """,""limit"": """ & Number.ToText(100) & """, ""cursor"": """ & cursor_ & """}")
    ])),
    GeneratedList = List.Generate(
        () => [res = FnGetOnePage("")],
        each [res][cursor] <> null,
        each [res = FnGetOnePage([res][cursor])]
    )
in
    GeneratedList
This gives me a list where every entry is the same output, the response for the first cursor.

Related

How to divide Spark DataFrame one row by its next row value using Scala functions

I have a Spark DataFrame like this and want to divide its first row by the next row using plain Scala only, not Spark functions. I am running into some issues and looking for suggestions.
+------+
|number|
+------+
|24 |
|12 |
+------+
Here the output should be 2, since 24/12 = 2.
Below is the code I am trying:
val spark = SparkSession
  .builder()
  .appName("Spark Test")
  .getOrCreate()

val testDF1 = spark.sql("""select string("24") as number""")
val testDF2 = spark.sql("""select string("12") as number""")
val testDF = testDF1.union(testDF2)
testDF.show(false)

import org.apache.spark.sql.functions._

def divideRows = udf((columnValue: String) => {
  var result = ""
  val dfAslist = testDF.collectAsList()
  result = dfAslist.get(0)/dfAslist.get(1)
  result
})

//Applying UDF Method on DataFrame Column
val finalDividedDF = testDF.withColumn("dividedValue", divideRows(testDF("columnValue")))
The output of this should be 2. Please help.
Although this task can be solved with a simple Window operation, if you need to do this custom aggregation without using Spark SQL functions you must use a Spark UDAF. The problem here is that "in a grouped aggregation, aggregate values (e.g. counts) are maintained for each unique value in the user-specified grouping column. In case of window-based aggregations, aggregate values are maintained for each window the event-time of a row falls into." You can use your custom aggregation to extract the result into an array and then explode that column.
Your UDAF code could be something like:
import scala.collection.mutable

import org.apache.spark.sql.Row
import org.apache.spark.sql.expressions.{MutableAggregationBuffer, UserDefinedAggregateFunction}
import org.apache.spark.sql.types._

class GenericOperation(func: (Double, Double) => Double)
  extends UserDefinedAggregateFunction {

  // Pair up consecutive elements: List(a, b, c) => List((a, b), (b, c))
  private def combinations2[A](input: List[A]): List[(A, A)] =
    input match {
      case h :: t if t.nonEmpty => (h, t.head) :: combinations2(t)
      case _ => Nil
    }

  /** Internal buffer to apply transformations. */
  private val bufferType = ArrayType(DoubleType)

  override def inputSchema: StructType =
    StructType(List(StructField("value", DoubleType)))

  override def bufferSchema: StructType =
    StructType(StructField("internal_buffer", bufferType) :: Nil)

  override def dataType: DataType = ArrayType(DoubleType)

  override def deterministic: Boolean = true

  override def initialize(buffer: MutableAggregationBuffer): Unit = {
    buffer(0) = Array[Double]()
  }

  override def update(buffer: MutableAggregationBuffer, input: Row): Unit = {
    buffer(0) = buffer.getAs[mutable.WrappedArray[Double]](0).+:(input.getAs[Double](0))
  }

  override def merge(buffer1: MutableAggregationBuffer, buffer2: Row): Unit = {
    buffer1(0) = buffer1.getAs[mutable.WrappedArray[Double]](0) ++ buffer2.getAs[mutable.WrappedArray[Double]](0)
  }

  override def evaluate(buffer: Row): Any = {
    val sequence = buffer.getAs[mutable.WrappedArray[Double]](0)
    val pairs = combinations2(sequence.toList)
    pairs.map(el => func(el._1, el._2)).toArray
  }
}
One example application:
import scala.collection.mutable

import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.types.{DoubleType, IntegerType, StructField, StructType}

object Example {

  implicit val spark: SparkSession = SparkSession.builder().master("local[*]").getOrCreate()

  val data = Seq(Row(1, 24.0), Row(2, 12.0), Row(3, 48.0), Row(4, 56.0))
  val df = spark
    .createDataFrame(spark.sparkContext.parallelize(data),
      StructType(List(StructField("order", IntegerType), StructField("value", DoubleType))))

  def main(args: Array[String]): Unit = {
    import spark.implicits._

    val aggregation = new GenericOperation((a, b) => a / b)
    df.agg(aggregation('value))
      .flatMap(el => el.getAs[mutable.WrappedArray[Double]](0)).show()
  }
}
The output is:
+------------------+
| value|
+------------------+
| 2.0|
| 0.25|
|0.8571428571428571|
+------------------+
I didn't check this with a structured streaming source. I think it will work, but besides being a slightly odd implementation (it should be done with a window, if possible), there is the question of what happens between mini-batches. You can try this option and tell me what problems you find.
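For reference, a minimal sketch of the window-based alternative mentioned above, written in PySpark (the order and value columns mirror the example data used in this answer; note this does use Spark SQL functions, which the asker wanted to avoid):
# Minimal PySpark sketch: divide each row's value by the value of the next row
# using a window and lead(); rows without a following row are dropped.
from pyspark.sql import SparkSession, Window
from pyspark.sql.functions import col, lead

spark = SparkSession.builder.master("local[*]").getOrCreate()
df = spark.createDataFrame([(1, 24.0), (2, 12.0), (3, 48.0), (4, 56.0)], ["order", "value"])

w = Window.orderBy("order")
result = (df
          .withColumn("next_value", lead("value").over(w))
          .withColumn("dividedValue", col("value") / col("next_value"))
          .where(col("next_value").isNotNull()))
result.show()  # dividedValue: 2.0, 0.25, 0.857...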

List Comparison in Python with Diff as output

The script below is what I have to compare Test1 vs Test2. The Test1 and Test2 data is shown at the bottom. I tried to make it generic so that it will also work for different devices.
import re

data_cleaned = {}
current_key = ''
action_flag = False
data_group = []
if_found_vlan = True

output = open('./output.txt', 'r').read()
switch_red = re.findall(r'(\w*-RED\d{0,1})', output)[0]
switch_blue = re.findall(r'(\w*-BLUE\d{0,1})', output)[0]

for line in open('./output.txt'):
    m = re.match(r'(\w*-RED\d{0,1}|\w*-BLUE\d{0,1})# sh run vlan \d+', line)
    if m:
        if not if_found_vlan:
            data_cleaned[current_key].append([])
        if_found_vlan = False
        current_key = m.group(1)
        if not data_cleaned.has_key(current_key):
            data_cleaned[current_key] = []
        continue
    mm = re.match(r'vlan \d+', line)
    if mm:
        if_found_vlan = True
        action_flag = True
        data_group = []
    if action_flag and '' == line.strip():
        action_flag = False
        data_cleaned[current_key].append(data_group)
    if action_flag:
        data_group.append(line.replace('\r', '').replace('\n', ''))

if not if_found_vlan:
    data_cleaned[current_key].append([])

#print ("+++++++++++++++++ The missing configuration ++++++++++++++\n")
print switch_blue + "#" + " has below missing VLAN config\n "
p = [item for index, item in enumerate(data_cleaned[switch_blue]) if [] != [it for it in item if it not in data_cleaned[switch_red][index]]]
print('\n'.join(['\n'.join(item) for item in p]))
print ("+++++++++++++++++++++++++++++++\n")
print switch_red + "#" + " has below missing VLAN config\n "
q = [item for index, item in enumerate(data_cleaned[switch_red]) if [] != [it for it in item if it not in data_cleaned[switch_blue][index]]]
print('\n'.join(['\n'.join(item) for item in q]))
Updated with your raw output. I think using a one-dimensional list to represent your output is not a good way for further handling.
When we handle data, we first need to clean it and set up a model that is easy for the program to process, so I use a dict with a two-dimensional list inside it to model your output, which makes it easier to process (a sample of the resulting structure is sketched after the code below).
import re

data_cleaned = {}
current_key = ''
action_flag = False
data_group = []
if_found_vlan = True

for line in open('./output.txt'):
    m = re.match(r'(Test\d+)# sh run vlan \d+', line)
    if m:
        if not if_found_vlan:
            data_cleaned[current_key].append([])
        if_found_vlan = False
        current_key = m.group(1)
        if not data_cleaned.has_key(current_key):
            data_cleaned[current_key] = []
        continue
    mm = re.match(r'vlan \d+', line)
    if mm:
        if_found_vlan = True
        action_flag = True
        data_group = []
    if action_flag and '' == line.strip():
        action_flag = False
        data_cleaned[current_key].append(data_group)
    if action_flag:
        data_group.append(line.replace('\r', '').replace('\n', ''))

if not if_found_vlan:
    data_cleaned[current_key].append([])

print ("+++++++++++++++++ The missing configuration is++++++++++++++\n")
p = [item for index, item in enumerate(data_cleaned['Test2']) if [] != [it for it in item if it not in data_cleaned['Test1'][index]]]
print('\n'.join(['\n'.join(item) for item in p]))
print ("+++++++++++++++++ The missing configuration is++++++++++++++\n")
q = [item for index, item in enumerate(data_cleaned['Test1']) if [] != [it for it in item if it not in data_cleaned['Test2'][index]]]
print('\n'.join(['\n'.join(item) for item in q]))
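For illustration, the resulting data_cleaned model might look like this for a hypothetical capture (the VLAN lines below are made up, not taken from the question):
# Hypothetical shape of the cleaned data model: one key per device, and one
# inner list of config lines per "sh run vlan" block found in output.txt.
data_cleaned = {
    'Test1': [
        ['vlan 10', '  name USERS'],
        ['vlan 20', '  name SERVERS'],
    ],
    'Test2': [
        ['vlan 10', '  name USERS'],
        [],  # an empty list is appended when a VLAN was not found
    ],
}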

Code not recognizing attribute in SOAP response while attribute is being printed

I am working with the ExactTarget FUEL SDK to retrieve data from Salesforce Marketing Cloud. More specifically, I am working on calling "Unsub Events" (https://github.com/salesforce-marketingcloud/FuelSDK-Python/blob/master/objsamples/sample_unsubevent.py#L15), but the structure of the SOAP response has a couple of deeply nested dictionary objects, which I need to iterate over and place into dataframes. Here is what the response looks like; I need to place each of the variables into a separate dataframe.
(UnsubEvent){
Client =
(ClientID){
ID = 11111111
}
PartnerKey = None
CreatedDate = 2016-07-13 13:37:46.000663
ModifiedDate = 2016-07-13 13:37:46.000663
ID = 11111111
ObjectID = "11111111"
SendID = 11111111
SubscriberKey = "aaa#aaa.com"
EventDate = 2016-07-13 13:37:46.000663
EventType = "Unsubscribe"
TriggeredSendDefinitionObjectID = None
BatchID = 1
List =
(List){
PartnerKey = None
ID = 11111111
ObjectID = None
Type = "aaaa"
ListClassification = "aaa"
}
IsMasterUnsubscribed = False
}]
I have successfully placed all variables into dataframes except one, "ListClassification". I am getting the error "List instance has no attribute 'ListClassification'". My question is: why is this happening if I can see the attribute in the response, and is there a fix for the issue?
My Code:
import ET_Client
import pandas as pd

try:
    debug = False
    stubObj = ET_Client.ET_Client(False, debug)

    print '>>>UnsubEvents'
    getUnsubEvent = ET_Client.ET_UnsubEvent()
    getUnsubEvent.auth_stub = stubObj
    getResponse3 = getUnsubEvent.get()
    ResponseResultsUnsubEvent = getResponse3.results
    #print ResponseResultsUnsubEvent

    ClientIDUnsubEvents = []
    partner_keys3 = []
    created_dates3 = []
    modified_date3 = []
    ID3 = []
    ObjectID3 = []
    SendID3 = []
    SubscriberKey3 = []
    EventDate3 = []
    EventType3 = []
    TriggeredSendDefinitionObjectID3 = []
    BatchID3 = []
    IsMasterUnsubscribed = []
    ListPartnerKey = []
    ListID = []
    ListObjectID = []
    ListType = []
    ListClassification = []

    for UnsubEvent in ResponseResultsUnsubEvent:
        ClientIDUnsubEvents.append(str(UnsubEvent['Client']['ID']))
        partner_keys3.append(UnsubEvent['PartnerKey'])
        created_dates3.append(UnsubEvent['CreatedDate'])
        modified_date3.append(UnsubEvent['ModifiedDate'])
        ID3.append(UnsubEvent['ID'])
        ObjectID3.append(UnsubEvent['ObjectID'])
        SendID3.append(UnsubEvent['SendID'])
        SubscriberKey3.append(UnsubEvent['SubscriberKey'])
        EventDate3.append(UnsubEvent['EventDate'])
        EventType3.append(UnsubEvent['EventType'])
        TriggeredSendDefinitionObjectID3.append(UnsubEvent['TriggeredSendDefinitionObjectID'])
        BatchID3.append(UnsubEvent['BatchID'])
        IsMasterUnsubscribed.append(UnsubEvent['IsMasterUnsubscribed'])
        ListPartnerKey.append(str(UnsubEvent['List']['PartnerKey']))
        ListID.append(str(UnsubEvent['List']['ID']))
        ListObjectID.append(str(UnsubEvent['List']['ObjectID']))
        ListType.append(str(UnsubEvent['List']['Type']))
        ListClassification.append(str(UnsubEvent['List']['ListClassification']))

    df3 = pd.DataFrame({'ListPartnerKey': ListPartnerKey, 'ListID': ListID, 'ListObjectID': ListObjectID, 'ListType': ListType,
                        'ClientID': ClientIDUnsubEvents, 'PartnerKey': partner_keys3, 'CreatedDate': created_dates3,
                        'ModifiedDate': modified_date3, 'ID': ID3, 'ObjectID': ObjectID3, 'SendID': SendID3, 'SubscriberKey': SubscriberKey3,
                        'EventDate': EventDate3, 'EventType': EventType3, 'TriggeredSendDefinitionObjectID': TriggeredSendDefinitionObjectID3,
                        'BatchID': BatchID3, 'ListClassification': ListClassification, 'IsMasterUnsubscribed': IsMasterUnsubscribed})
    print df3
Literally every other attribute is going into the dataframe, but I am not sure why "ListClassification" is not being picked up.
Thank you in advance for your help!
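One likely explanation is that some UnsubEvent.List elements in the SOAP response simply omit ListClassification, so accessing it raises an error even though other events print it. A minimal, hedged sketch of guarding that access (the helper name and the 'N/A' default are illustrative, not part of the FuelSDK):
# Illustrative guard: fall back to a default when the SOAP response
# omits ListClassification for a given event's List element.
def safe_get(obj, name, default='N/A'):
    """Return the named attribute from a suds-style response object, or a default."""
    return getattr(obj, name, default)

# inside the existing loop over ResponseResultsUnsubEvent:
#     ListClassification.append(str(safe_get(UnsubEvent['List'], 'ListClassification')))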

amazon api not able to find reduced price

I am using the code snippet below to get the red reduced price from Amazon, but somehow I am always getting the old price and not the reduced red one.
def getSignedUrlAmazon(searchvalue):
    params = {'ResponseGroup': 'Medium',
              'AssociateTag': '',
              'Operation': 'ItemSearch',
              'SearchIndex': 'All',
              'Keywords': searchvalue}
    action = 'GET'
    server = "webservices.amazon.in"
    path = "/onca/xml"
    params['Version'] = '2011-08-01'
    params['AWSAccessKeyId'] = AWS_ACCESS_KEY_ID
    params['Service'] = 'AWSECommerceService'
    params['Timestamp'] = time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime())
    # Now sort by keys and make the param string
    key_values = [(urllib.quote(k), urllib.quote(v)) for k, v in params.items()]
    key_values.sort()
    # Combine key value pairs into a string.
    paramstring = '&'.join(['%s=%s' % (k, v) for k, v in key_values])
    urlstring = "http://" + server + path + "?" + \
        ('&'.join(['%s=%s' % (k, v) for k, v in key_values]))
    # Add the method and path (always the same, how RESTy!) and get it ready to sign
    hmac.update(action + "\n" + server + "\n" + path + "\n" + paramstring)
    # Sign it up and make the url string
    urlstring = urlstring + "&Signature=" + \
        urllib.quote(base64.encodestring(hmac.digest()).strip())
    return urlstring
I forgot the price-grabbing part:
def getDataAmazon(searchvalue, PRICE, lock):
    try:
        searchvalue = removeStopWord(searchvalue)
        url = getSignedUrlAmazon(searchvalue)
        data = etree.parse(url)
        root = objectify.fromstring(etree.tostring(data, pretty_print=True))
        gd = []
        gd1 = []
        counter = 0
        if hasattr(root, 'Items') and hasattr(root.Items, 'Item'):
            for item in root.Items.Item:
                cd = {}
                priceCheckFlag = False
                try:
                    if hasattr(item.ItemAttributes, 'EAN'):
                        cd['EAN'] = str(item.ItemAttributes.EAN)
                    elif hasattr(item, 'ASIN'):
                        cd['ASIN'] = str(item.ASIN)
                    cd['productName'] = str(item.ItemAttributes.Title)
                    if hasattr(item, 'SmallImage'):
                        cd['imgLink'] = str(item.SmallImage.URL)
                    elif hasattr(item, 'MediumImage'):
                        cd['imgLink'] = str(item.MediumImage.URL)
                    cd['numstar'] = "0"
                    cd['productdiv'] = '1'
                    cd['sellername'] = 'Amazon'
                    if hasattr(item, 'DetailPageURL'):
                        cd['visitstore'] = str(item.DetailPageURL)
                    if hasattr(item.ItemAttributes, 'ListPrice') and hasattr(item.ItemAttributes.ListPrice, 'Amount'):
                        cd['price'] = item.ItemAttributes.ListPrice.Amount/100.0
                    elif hasattr(item, 'OfferSummary') and hasattr(item.OfferSummary, 'LowestNewPrice'):
                        cd['price'] = item.OfferSummary.LowestNewPrice.Amount/100.0
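For what it is worth, ListPrice in this response group is generally the pre-discount price, while the discounted ("red") price is usually found under OfferSummary or the Offers response group; below is a hedged sketch that checks the offer price before falling back to ListPrice (which elements are present varies per item, so this is an assumption about the response shape):
# Illustrative: prefer the current offer price over the pre-discount ListPrice.
def extract_price(item):
    if hasattr(item, 'OfferSummary') and hasattr(item.OfferSummary, 'LowestNewPrice'):
        return item.OfferSummary.LowestNewPrice.Amount / 100.0  # current (reduced) offer
    if hasattr(item.ItemAttributes, 'ListPrice') and hasattr(item.ItemAttributes.ListPrice, 'Amount'):
        return item.ItemAttributes.ListPrice.Amount / 100.0     # original list price
    return None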

Django Filter Loop OR

Does anyone know how I can get a Django filter to build an OR statement? I'm not sure if I have to use the Q object or not; I thought I needed some type of OR pipe, but this doesn't seem right:
filter_mfr_method = request.GET.getlist('filter_mfr_method')
for m in filter_mfr_method:
    designs = designs.filter(Q(mfr_method = m) | m if m else '')
    # Or should I do it this way?
    # designs = designs.filter(mfr_method = m | m if m else '')
I would like this to be:
SELECT * FROM table WHERE mfr_method = 1 OR mfr_method = 2 OR mfr_method = 3
EDIT: Here is what worked:
filter_mfr_method = request.GET.getlist('filter_mfr_method')
list = []
for m in filter_mfr_method:
    list.append(Q(mfr_method = m))
designs = designs.filter(reduce(operator.or_, list))
What about:
import operator
from django.db.models import Q

filter_mfr_method = request.GET.getlist('filter_mfr_method')
filter_params = reduce(operator.or_, (Q(mfr_method=m) for m in filter_mfr_method), Q())
designs = designs.filter(filter_params)
Something I used before:
qry = None
for val in request.GET.getlist('filter_mfr_method'):
    v = {'mfr_method': val}
    q = Q(**v)
    if qry:
        qry = qry | q
    else:
        qry = q
designs = designs.filter(qry)
That is taken from one of my query builders.
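As an aside, when every OR term tests the same field for equality, the same SQL can also be expressed with Django's __in lookup; a minimal sketch using the field name from the question:
# Equivalent to mfr_method = 1 OR mfr_method = 2 OR mfr_method = 3 for one field.
filter_mfr_method = request.GET.getlist('filter_mfr_method')
if filter_mfr_method:
    designs = designs.filter(mfr_method__in=filter_mfr_method)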