Parsing strings...


SoLoR

OK, I'm trying to do some parsing magic. I have strings like this:

    hfList = [{fixid:'410273',product:'.NET Framework 3.0 - Win7, Windows Server 2008 R2 \x28CBS\x29',language:'All \x28Global\x29',langcode:'intl',platform:'x86',release:'sp2',filename:'DevDiv879341',version:'NDP 3.0',build:'30729.5020',size:'4382033',credate:'3\x2f31\x2f2010 10\x3a41\x3a36 PM',moddate:'3\x2f31\x2f2010 10\x3a41\x3a36 PM'},{fixid:'410225',product:'.NET Framework 3.0 - Windows Server 2003, Windows XP \x28MSI\x29',language:'All \x28Global\x29',langcode:'intl',platform:'x86',release:'sp2',filename:'DevDiv866275',version:'NDP 3.0',build:'30729.4519',size:'8630148',credate:'3\x2f29\x2f2010 10\x3a55\x3a52 PM',moddate:'3\x2f29\x2f2010 10\x3a55\x3a52 PM'},{fixid:'410274',product:'.NET Framework 3.0 - Win7, Windows Server 2008 R2 \x28CBS\x29',language:'All \x28Global\x29',langcode:'intl',platform:'x64',release:'sp2',filename:'DevDiv879341',version:'NDP 3.0',build:'30729.5020',size:'6050502',credate:'3\x2f31\x2f2010 10\x3a42\x3a38 PM',moddate:'3\x2f31\x2f2010 10\x3a42\x3a38 PM'},{fixid:'410226',product:'.NET Framework 3.0 - Windows Server 2003, Windows XP \x28MSI\x29',language:'All \x28Global\x29',langcode:'intl',platform:'x64',release:'sp2',filename:'DevDiv866275',version:'NDP 3.0',build:'30729.4519',size:'14043927',credate:'3\x2f29\x2f2010 10\x3a57\x3a31 PM',moddate:'3\x2f29\x2f2010 10\x3a57\x3a31 PM'}];

It's obvious what I want here :) I want to write some kind of parsing algorithm (in Perl, I guess, so I can use it on my Linux box) that extracts the data I want: product, platform, date, and so on. This is for parsing the MS hotfix database. I currently have this as a shell script:

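#!/bin/sh
# Fetch the hotfix page for each KB number and keep only the hfList line.

i=981107
while [ "$i" -le 981110 ]
do
    # Quote the URL: an unquoted & would run wget in the background
    # and treat kbln=en-us as a shell variable assignment.
    wget --cookies=on --load-cookies=cookie.txt -O "page_${i}.html" \
        "http://support.microsoft.com/hotfix/KBHotfix.aspx?kbnum=${i}&kbln=en-us"
    grep hfList "page_${i}.html" > "${i}"
    rm -f "page_${i}.html"
    # Drop the result file if it is empty (no hfList on the page).
    if [ ! -s "${i}" ]; then
        rm -f "${i}"
    fi
    i=$((i + 1))    # POSIX arithmetic; (( i++ )) only works in bash
done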

This gives me files (named by KB number) containing the string above. I just need to parse them properly to build a database/table from all the data. So, any ideas? :) Eventually I'll try to make it send me a notice about which updates are new, and of course put it in crontab to go through the MS hotfix database once or twice a week. But the first thing is the parsing. If I have to Google and use trial and error, it will take a long time.
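One possible way to do the parsing, sketched in Python rather than Perl. It assumes every value is single-quoted and that special characters (including any braces or quotes) arrive as \xNN escapes, as in the sample above. The names parse_hflist and unescape are made up for this sketch, and SAMPLE is an abridged copy of the posted string.

```python
import re

# Abridged sample in the format from the first post (two objects, fewer keys).
SAMPLE = (
    r"hfList = [{fixid:'410273',product:'.NET Framework 3.0 - Win7, "
    r"Windows Server 2008 R2 \x28CBS\x29',platform:'x86',"
    r"credate:'3\x2f31\x2f2010 10\x3a41\x3a36 PM'},"
    r"{fixid:'410225',product:'.NET Framework 3.0 - Windows Server 2003, "
    r"Windows XP \x28MSI\x29',platform:'x86',"
    r"credate:'3\x2f29\x2f2010 10\x3a55\x3a52 PM'}];"
)

# Assumption: no literal "}" appears inside a value (they would be \x7d).
OBJECT_RE = re.compile(r"\{(.*?)\}")
# key:'value' pairs; the value may contain commas and backslash escapes.
PAIR_RE = re.compile(r"(\w+):'((?:[^'\\]|\\.)*)'")
HEX_RE = re.compile(r"\\x([0-9a-fA-F]{2})")


def unescape(value):
    """Turn \\xNN escapes into the characters they stand for."""
    return HEX_RE.sub(lambda m: chr(int(m.group(1), 16)), value)


def parse_hflist(line):
    """Return one dict per {...} object found in an hfList line."""
    records = []
    for body in OBJECT_RE.findall(line):
        records.append({key: unescape(val) for key, val in PAIR_RE.findall(body)})
    return records


for record in parse_hflist(SAMPLE):
    print(record["fixid"], record["platform"], record["credate"],
          record["product"], sep="\t")
```

Matching whole key:'value' pairs instead of splitting on commas matters here, because the product names themselves contain commas.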

Edited by SoLoR

Legolash2o (creator of Win7toolkit) is great at this stuff. I recall he made one that went to that exact URL and checked whether the hotfix was for Vista SP1.

I have also requested a tool that would do exactly that, and also differentiate whether the hotfix is for Windows 7, Vista, XP, which service pack, etc.
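A rough sketch of that kind of OS check, assuming the only hint is the product text from hfList. The keyword table is a guess based on the two product names in the first post, not a complete list of what Microsoft uses, and detect_os is a made-up name.

```python
# Hypothetical keyword table: OS label -> substrings to look for (lower case).
OS_KEYWORDS = [
    ("Windows 7", ("win7", "windows 7")),
    ("Windows Vista", ("vista",)),
    ("Windows XP", ("windows xp",)),
]


def detect_os(product):
    """Return the OS labels whose keywords occur in a product string."""
    text = product.lower()
    return [label for label, needles in OS_KEYWORDS
            if any(needle in text for needle in needles)]


print(detect_os(".NET Framework 3.0 - Win7, Windows Server 2008 R2 (CBS)"))
print(detect_os(".NET Framework 3.0 - Windows Server 2003, Windows XP (MSI)"))
```

A product that matches nothing yields an empty list, which is a cheap way to spot names the table does not cover yet.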


Yes, of course, the goal is to create one big database/table with everything, which you can then filter for what you want/need. For a start I just want a simple tab-separated table, so I can import it into Excel and sort by date. For example, the goal is something like this: http://www16.atpages.jp/~hotfix/ but first of all with all fixes, preferably as a MySQL database with custom filters... Oh well, somehow I feel that with my current coding knowledge I'll be doing this **** for a few months, since I need to learn basically everything from scratch :)
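The tab-separated table could look roughly like this in Python, assuming the records are already parsed into dicts and that credate is month/day/year with a 12-hour clock (which the "3/31/2010 10:41:36 PM" sample suggests). The column choice and the name write_tsv are just for illustration.

```python
import csv
import io
from datetime import datetime

# Assumed date layout, e.g. "3/31/2010 10:41:36 PM".
DATE_FORMAT = "%m/%d/%Y %I:%M:%S %p"
COLUMNS = ["fixid", "product", "platform", "credate"]

# Two records taken from the sample string, already unescaped.
RECORDS = [
    {"fixid": "410273", "platform": "x86", "credate": "3/31/2010 10:41:36 PM",
     "product": ".NET Framework 3.0 - Win7, Windows Server 2008 R2 (CBS)"},
    {"fixid": "410225", "platform": "x86", "credate": "3/29/2010 10:55:52 PM",
     "product": ".NET Framework 3.0 - Windows Server 2003, Windows XP (MSI)"},
]


def write_tsv(records, out):
    """Write records to a file-like object as TSV, oldest credate first."""
    ordered = sorted(
        records, key=lambda r: datetime.strptime(r["credate"], DATE_FORMAT))
    writer = csv.DictWriter(out, fieldnames=COLUMNS, delimiter="\t",
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    writer.writerows(ordered)


buffer = io.StringIO()
write_tsv(RECORDS, buffer)
print(buffer.getvalue(), end="")
```

Sorting on the parsed datetime instead of the raw text avoids "3/9/2010" landing after "3/31/2010". To write a real file, pass open("hotfixes.tsv", "w", newline="") in place of the StringIO buffer.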


lol, that's what I created Windows Updater for. Every time you released an update I would add it to the SQL database, and whoever did not have it installed would be shown the new updates. If I can help, let me know. I use VB .NET :D

@Ricktendo64

I remember that program now. It took forever to go through all the KB articles, though :P

Edited by Legolash2o

Why I'm doing this: KB hotfixes are posted sooner than the KB articles are created. For now I've found an "easy way out": parsing the articles with "Win7" or "Windows 7" into one directory and then diffing it against the "old" one. This doesn't give me a database, but I do know what is new and what's not :) I'm currently in the middle of parsing hotfixes from 970000 to 983000 to see how long it will take. I know this doesn't cover everything, but it will include 99.9% of the new ones, and if I miss one every full moon then so be it. It's running at approx. 1 article/sec, so a bit under 4 hours for the whole range. What I'll do is run it from crontab twice a week and have it mail me the diffs... The thing is, for the "big" project I had in mind we would need a full 24/7 server and everything. Doing something this size for someone running it on his home computer seems kind of pointless.
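The directory diff can be sketched in Python along these lines, assuming each run leaves one file per KB number in its own directory. The function name compare_runs and the demo file contents are invented for the example.

```python
import filecmp
import os
import tempfile


def compare_runs(old_dir, new_dir):
    """Return (added, changed) file names between two result directories."""
    old = set(os.listdir(old_dir))
    new = set(os.listdir(new_dir))
    added = sorted(new - old)
    # shallow=False compares file contents, not just size and mtime.
    changed = sorted(
        name for name in new & old
        if not filecmp.cmp(os.path.join(old_dir, name),
                           os.path.join(new_dir, name), shallow=False))
    return added, changed


def _write(directory, name, text):
    with open(os.path.join(directory, name), "w") as handle:
        handle.write(text)


# Demo with throwaway directories standing in for the "old" and "new" runs.
with tempfile.TemporaryDirectory() as old_dir, \
        tempfile.TemporaryDirectory() as new_dir:
    _write(old_dir, "980951", "hfList = [old];\n")
    _write(old_dir, "981002", "hfList = [same];\n")
    _write(new_dir, "980951", "hfList = [new];\n")
    _write(new_dir, "981002", "hfList = [same];\n")
    _write(new_dir, "981107", "hfList = [added];\n")
    added, changed = compare_runs(old_dir, new_dir)

print("new:", added)
print("changed:", changed)
```

Reporting changed files separately from new ones would also catch a KB number that gains another hotfix after it was first seen.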

But yeah, the general idea is: you download the hotfix download page (you need to provide a cookie showing you agreed to the license agreement), then you search for the hfList variable inside. If it exists, the article has public hotfixes, and everything needed is in that hfList: OS, dates, architecture, etc.
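A minimal sketch of that check in Python, assuming the page contains the assignment in the form "hfList = [...];" as in the first post. The surrounding HTML in PAGE is invented, since the real page layout is not shown here, and extract_hflist is a made-up name.

```python
import re

# Assumption: the array ends at the first "];" after "hfList =".
HFLIST_RE = re.compile(r"hfList\s*=\s*(\[.*?\]);", re.DOTALL)


def extract_hflist(html):
    """Return the hfList array text, or None if the page has no hfList."""
    match = HFLIST_RE.search(html)
    return match.group(1) if match else None


# Invented page snippets for the demo.
PAGE = "<script>var hfList = [{fixid:'410273',platform:'x86'}];</script>"
EMPTY_PAGE = "<html><body>No public hotfix for this article.</body></html>"

print(extract_hflist(PAGE))
print(extract_hflist(EMPTY_PAGE))
```

A None result plays the same role as the empty-file check in the shell script: the article has no public hotfixes and can be skipped.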

Edited by SoLoR

04.04.2010 13:55 16.128 970413
04.04.2010 13:55 1.625 970486
04.04.2010 13:55 2.930 970924
04.04.2010 13:55 1.480 971601
04.04.2010 13:55 2.918 971988
04.04.2010 13:55 2.137 972251
04.04.2010 13:55 1.414 972551
04.04.2010 13:55 2.225 972583
04.04.2010 13:55 591 972831
04.04.2010 13:55 2.840 972848
04.04.2010 13:55 855 973062
04.04.2010 13:55 1.669 973529
04.04.2010 13:55 2.914 973746
04.04.2010 13:55 882 973751
04.04.2010 13:55 1.996 973975
04.04.2010 13:55 593 974039
04.04.2010 13:55 3.014 974065
04.04.2010 13:55 567 974090
04.04.2010 13:55 3.012 974168
04.04.2010 13:55 16.529 974176
04.04.2010 13:55 590 974204
04.04.2010 13:55 587 974259
04.04.2010 13:55 1.129 974324
04.04.2010 13:55 3.006 974372
04.04.2010 13:55 871 974476
04.04.2010 13:55 1.986 974477
04.04.2010 13:55 876 974560
04.04.2010 13:55 589 974598
04.04.2010 13:55 879 974624
04.04.2010 13:55 304 974636
04.04.2010 13:55 594 974638
04.04.2010 13:55 873 974639
04.04.2010 13:55 303 974672
04.04.2010 13:55 910 974674
04.04.2010 13:55 302 974719
04.04.2010 13:55 304 974909
04.04.2010 13:55 587 974912
04.04.2010 13:55 592 974930
04.04.2010 13:55 589 974940
04.04.2010 13:55 589 974944
04.04.2010 13:55 589 975053
04.04.2010 13:55 2.900 975076
04.04.2010 13:55 836 975142
04.04.2010 13:55 882 975214
04.04.2010 13:55 867 975243
04.04.2010 13:55 1.672 975332
04.04.2010 13:55 306 975351
04.04.2010 13:55 1.161 975354
04.04.2010 13:55 301 975360
04.04.2010 13:55 1.690 975363
04.04.2010 13:55 1.680 975415
04.04.2010 13:55 591 975469
04.04.2010 13:55 879 975496
04.04.2010 13:55 589 975499
04.04.2010 13:55 885 975500
04.04.2010 13:55 1.666 975512
04.04.2010 13:55 304 975530
04.04.2010 13:55 589 975535
04.04.2010 13:55 593 975538
04.04.2010 13:55 873 975599
04.04.2010 13:55 879 975617
04.04.2010 13:55 1.732 975619
04.04.2010 13:55 872 975620
04.04.2010 13:55 1.679 975680
04.04.2010 13:55 873 975688
04.04.2010 13:55 879 975741
04.04.2010 13:55 587 975762
04.04.2010 13:55 305 975763
04.04.2010 13:55 879 975777
04.04.2010 13:55 879 975778
04.04.2010 13:55 591 975806
04.04.2010 13:55 595 975851
04.04.2010 13:55 873 975858
04.04.2010 13:55 873 975921
04.04.2010 13:55 2.942 975954
04.04.2010 13:55 881 975992
04.04.2010 13:55 1.137 976038
04.04.2010 13:55 873 976090
04.04.2010 13:55 305 976093
04.04.2010 13:55 306 976096
04.04.2010 13:55 879 976099
04.04.2010 13:55 1.990 976117
04.04.2010 13:55 879 976187
04.04.2010 13:55 587 976210
04.04.2010 13:55 1.690 976240
04.04.2010 13:55 879 976296
04.04.2010 13:55 1.393 976373
04.04.2010 13:55 879 976398
04.04.2010 13:55 879 976399
04.04.2010 13:55 597 976417
04.04.2010 13:55 873 976418
04.04.2010 13:55 882 976419
04.04.2010 13:55 879 976422
04.04.2010 13:55 1.181 976425
04.04.2010 13:55 879 976427
04.04.2010 13:55 1.690 976438
04.04.2010 13:55 882 976443
04.04.2010 13:55 996 976462
04.04.2010 13:55 595 976483
04.04.2010 13:55 306 976484
04.04.2010 13:55 888 976494
04.04.2010 13:55 881 976525
04.04.2010 13:55 875 976527
04.04.2010 13:55 595 976528
04.04.2010 13:55 888 976571
04.04.2010 13:55 874 976586
04.04.2010 13:55 595 976627
04.04.2010 13:55 304 976655
04.04.2010 13:55 881 976658
04.04.2010 13:55 879 976746
04.04.2010 13:55 885 976755
04.04.2010 13:55 1.678 976759
04.04.2010 13:55 593 976781
04.04.2010 13:55 591 976782
04.04.2010 13:55 879 976883
04.04.2010 13:55 882 976887
04.04.2010 13:55 3.025 976898
04.04.2010 13:55 880 976910
04.04.2010 13:55 593 976972
04.04.2010 13:55 879 977015
04.04.2010 13:55 999 977020
04.04.2010 13:55 595 977036
04.04.2010 13:55 881 977067
04.04.2010 13:55 304 977068
04.04.2010 13:55 2.664 977069
04.04.2010 13:55 597 977071
04.04.2010 13:55 873 977096
04.04.2010 13:55 885 977132
04.04.2010 13:55 881 977158
04.04.2010 13:55 1.735 977178
04.04.2010 13:55 880 977180
04.04.2010 13:55 1.998 977182
04.04.2010 13:55 303 977184
04.04.2010 13:55 879 977186
04.04.2010 13:55 873 977222
04.04.2010 13:55 885 977307
04.04.2010 13:55 599 977314
04.04.2010 13:55 879 977342
04.04.2010 13:55 877 977346
04.04.2010 13:55 879 977348
04.04.2010 13:55 873 977353
04.04.2010 13:55 880 977357
04.04.2010 13:55 888 977375
04.04.2010 13:55 876 977380
04.04.2010 13:55 9.619 977381
04.04.2010 13:55 886 977392
04.04.2010 13:55 595 977397
04.04.2010 13:55 1.200 977413
04.04.2010 13:55 593 977415
04.04.2010 13:55 590 977417
04.04.2010 13:55 882 977419
04.04.2010 13:55 1.002 977420
04.04.2010 13:55 885 977542
04.04.2010 13:55 306 977570
04.04.2010 13:55 879 977579
04.04.2010 13:55 886 977608
04.04.2010 13:55 591 977609
04.04.2010 13:55 12.556 977615
04.04.2010 13:55 595 977617
04.04.2010 13:55 591 977620
04.04.2010 13:55 887 977627
04.04.2010 13:55 589 977632
04.04.2010 13:55 1.998 977643
04.04.2010 13:55 885 977648
04.04.2010 13:55 304 977692
04.04.2010 13:55 877 977695
04.04.2010 13:55 16.568 977748
04.04.2010 13:55 873 977787
04.04.2010 13:55 595 977832
04.04.2010 13:55 1.008 977866
04.04.2010 13:55 568 977894
04.04.2010 13:55 591 977911
04.04.2010 13:55 883 977944
04.04.2010 13:55 886 977977
04.04.2010 13:55 867 978000
04.04.2010 13:55 594 978034
04.04.2010 13:55 1.676 978042
04.04.2010 13:55 874 978048
04.04.2010 13:55 304 978055
04.04.2010 13:55 874 978116
04.04.2010 13:55 589 978118
04.04.2010 13:55 16.421 978125
04.04.2010 13:55 10.492 978155
04.04.2010 13:55 1.672 978157
04.04.2010 13:55 589 978205
04.04.2010 13:55 587 978206
04.04.2010 13:55 3.032 978249
04.04.2010 13:55 3.032 978254
04.04.2010 13:55 587 978258
04.04.2010 13:55 591 978265
04.04.2010 13:55 874 978277
04.04.2010 13:55 873 978330
04.04.2010 13:55 885 978347
04.04.2010 13:55 591 978387
04.04.2010 13:55 883 978433
04.04.2010 13:55 1.991 978454
04.04.2010 13:55 879 978462
04.04.2010 13:55 592 978476
04.04.2010 13:55 1.678 978491
04.04.2010 13:55 867 978500
04.04.2010 13:55 874 978516
04.04.2010 13:55 2.328 978520
04.04.2010 13:55 873 978526
04.04.2010 13:55 588 978527
04.04.2010 13:55 595 978529
04.04.2010 13:55 879 978535
04.04.2010 13:55 592 978562
04.04.2010 13:55 873 978571
04.04.2010 13:55 589 978632
04.04.2010 13:55 873 978714
04.04.2010 13:55 873 978738
04.04.2010 13:55 867 978837
04.04.2010 13:55 879 978838
04.04.2010 13:55 881 978869
04.04.2010 13:55 873 978884
04.04.2010 13:55 870 978917
04.04.2010 13:55 873 978918
04.04.2010 13:55 588 978943
04.04.2010 13:55 874 978977
04.04.2010 13:55 996 978981
04.04.2010 13:55 873 978982
04.04.2010 13:55 873 979039
04.04.2010 13:55 1.686 979101
04.04.2010 13:55 879 979140
04.04.2010 13:55 875 979155
04.04.2010 13:55 303 979214
04.04.2010 13:55 867 979223
04.04.2010 13:55 589 979224
04.04.2010 13:55 1.670 979241
04.04.2010 13:55 587 979294
04.04.2010 13:55 880 979350
04.04.2010 13:55 591 979366
04.04.2010 13:55 587 979373
04.04.2010 13:55 588 979374
04.04.2010 13:55 879 979383
04.04.2010 13:55 876 979425
04.04.2010 13:55 876 979443
04.04.2010 13:55 882 979444
04.04.2010 13:55 873 979470
04.04.2010 13:55 879 979491
04.04.2010 13:55 873 979495
04.04.2010 13:55 303 979524
04.04.2010 13:55 873 979530
04.04.2010 13:55 879 979532
04.04.2010 13:55 2.002 979533
04.04.2010 13:55 867 979538
04.04.2010 13:55 873 979543
04.04.2010 13:55 1.127 979548
04.04.2010 13:55 2.007 979562
04.04.2010 13:55 302 979564
04.04.2010 13:55 587 979567
04.04.2010 13:55 876 979580
04.04.2010 13:55 873 979619
04.04.2010 13:55 873 979643
04.04.2010 13:55 588 979681
04.04.2010 13:55 587 979711
04.04.2010 13:55 3.028 979744
04.04.2010 13:55 587 979745
04.04.2010 13:55 878 979751
04.04.2010 13:55 879 979791
04.04.2010 13:55 882 979903
04.04.2010 13:55 2.001 979917
04.04.2010 13:55 3.097 980077
04.04.2010 13:55 871 980120
04.04.2010 13:55 2.007 980138
04.04.2010 13:55 1.684 980226
04.04.2010 13:55 2.010 980251
04.04.2010 13:55 583 980295
04.04.2010 13:55 1.182 980333
04.04.2010 13:55 303 980358
04.04.2010 13:55 1.668 980368
04.04.2010 13:55 876 980396
04.04.2010 13:55 1.681 980423
04.04.2010 13:55 303 980598
04.04.2010 13:55 875 980681
04.04.2010 13:55 302 980856
04.04.2010 13:55 869 980922
04.04.2010 13:55 2.656 980951
04.04.2010 13:55 587 980959
04.04.2010 13:55 996 981002
04.04.2010 13:55 1.336 981107
04.04.2010 13:55 583 981112
04.04.2010 13:55 999 981119
04.04.2010 13:55 16.297 981128
04.04.2010 13:55 585 981129
04.04.2010 13:55 585 981130
04.04.2010 13:55 1.394 981194
04.04.2010 13:55 583 981214
04.04.2010 13:55 1.992 981286
04.04.2010 13:55 1.994 981303
04.04.2010 13:55 304 981618
04.04.2010 13:55 587 981813
04.04.2010 13:55 881 981877
293 File(s)

Here is everything available "on request" in the MS database from 970000 to 983000 regarding Windows 7. Before you ask: yes, some updates are duplicated and things like that...

Edited by SoLoR
