Stuck in a loop

Author Message
jondank 10/04/2011 08:09 am
Oleg

During the latest download I watched as the queue slowly shrank and then, after a considerable downtrend, turned back up again. I checked the download folder and saw that no more files were actually being downloaded to my drive. It seems like it was stuck in a loop. I stopped it.

Any ideas?

My project profile below.


John

Stream 1.2 File
[Object]
OEVersion=Enterprise 6.0.0.3632
Type=0
IID=7020
Caption=http://oursource/sites/ESDITStrategyNew/IT/
URL=http://oursource/sites/ESDITStrategyNew/IT/
Lev=1000001
Weekday=257
User=xxxxxxxxx
Psw=xxxxxxxxxxx
LimBTSize=True
LimBTime=True
LimTSize=30000000
LimNumber=5000
LimTime=1000
FMGroup=2
LTMethod=1
LTNoFollow=True
pswMethod=1
FTText.Exts=htmlhtmaspaspxjspstmstmlidcshtmlhtxtxttextxspxmlrxmlcfmwmlphpphp3
FTImages.Exts=gifjpgjpegtiftiffxbmfifbmppngipxjp2j2cj2kwbmplwfwebp
FTVideo.Exts=mpgavianimpegmovflvfliflcvivrmramrvasfasxwmvm1vm2vvobsmilmp4m4v
FTAudio.Exts=wavriffmp3midmp2m3uravocwmaapeoggm4aaif
FTArchive.Exts=7zziparcgzzarjlhalayleirarcabtarpakacejarpdftgzexeiso
FTUDef.Exts=123axdclasscssdocdocmdocxdtdentgsagtajslwdlwpmdamdbmmfmpcmppmsgmswnsfpptpptxprnpstrtfssiswfvbsvsdwbkwkbwkswp5wpdxlmxlsxlsmxlsxxsl xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
FTText.B=xoxooo
FTImages.B=xoxooo
FTVideo.B=xoxooo
FTAudio.B=xoxooo
FTArchive.B=ooxooo
FTUDef.B=ooxooo
FTOther.B=xoxooo
FTSizes=0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,2,2,2,2,2,2
NotIgnoreLogout=False
RPathBx=1
RProt=255
LastStart=247:51:207:148:130:238:227:64:
LastEnd=215:125:139:151:137:238:227:64:
LastStarted=10/4/2011 1:56:09 AM
LastEnded=10/4/2011 7:11:38 AM
S200=308289
S400=1693
SAbr=152454
SPar=307026
SSav=2101
SLast=200
SSiz=912119601
SMdf=2101
SHTML=300220
SSuccDowns=1
LFiles=309009
LSize=13592571234
Stopped=True
Flags=1
CFFlags=64
ImgDim=0,0,0,0
PrevURL=http://oursource/sites/ESDITStrategyNew/IT/
ExploreDirs=True
LIndexed=False
IndexFiles=False
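
A rough sketch for reading the run times out of a profile like the one above. It assumes the profile is plain `key=value` text and that the timestamps are in month/day/year order with a 12-hour clock. The function names are hypothetical and are not part of Offline Explorer.

```python
from datetime import datetime

# A few lines copied from the profile above; the rest is omitted for brevity.
PROFILE_TEXT = """\
LastStarted=10/4/2011 1:56:09 AM
LastEnded=10/4/2011 7:11:38 AM
Stopped=True
"""

def parse_profile(text):
    """Collect key=value lines into a dict; lines without '=' are skipped."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            fields[key.strip()] = value.strip()
    return fields

def run_duration(fields, fmt="%m/%d/%Y %I:%M:%S %p"):
    """Elapsed time between LastStarted and LastEnded (format is an assumption)."""
    start = datetime.strptime(fields["LastStarted"], fmt)
    end = datetime.strptime(fields["LastEnded"], fmt)
    return end - start

fields = parse_profile(PROFILE_TEXT)
print("Run duration:", run_duration(fields))
```

For the values posted here the run lasted a little over five hours before it was stopped.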


Oleg Chernavin 10/04/2011 08:11 am
Did you monitor the Queue tab to see whether there are any strange URLs? Please also try decreasing the number of attempts in the Internet tab of the Ribbon.

Best regards,
Oleg Chernavin
MP Staff
jondank 10/04/2011 08:25 am
Oleg

I did not notice any unusually strange URLs but of course usually many of the URLs are strange. That is why I wanted to eliminate all the web junk DNA.

I will try your idea re attempts. I don't think it is a userid/password issue as I have it set to prompt me. I could be having an attempts issue in terms of contention at the URL. I think I saw that happen once.

John
Oleg Chernavin 10/04/2011 08:28 am
I understand. I don't have any other ideas, because I can't even reproduce this without access to the site.

Oleg.
jondank 10/04/2011 08:30 pm
Oleg

I reduced the number of retries to 2 and restarted the download of missing files. I noticed that while it is not actually copying the unwanted files (such as aspx), it is reporting that they are being downloaded. This seems inefficient.

John
jondank 10/04/2011 11:41 pm
Oleg

Reducing the number of retries did not fix the problem. I watched it loop over all of the lowest-level folders alphabetically. It processed A through Z and then started at A again. It keeps looping like this. I'm stopping it and moving to the next project.
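
One way to confirm a loop like this is to count how often each URL is processed. The sketch below assumes you can get the processed URLs out as a plain list, for example copied from the Queue tab. That export step is an assumption, and the sample URLs are made up.

```python
from collections import Counter

def repeated_urls(urls, min_count=2):
    """Return (url, count) pairs for URLs seen at least min_count times."""
    counts = Counter(u.strip() for u in urls if u.strip())
    return [(u, n) for u, n in counts.most_common() if n >= min_count]

# Hypothetical sample: folders A..Z processed once, then A and B again.
sample = [
    "http://example.invalid/site/A/",
    "http://example.invalid/site/B/",
    "http://example.invalid/site/Z/",
    "http://example.invalid/site/A/",
    "http://example.invalid/site/B/",
]

for url, n in repeated_urls(sample):
    print(n, url)
```

If the same folder URLs keep showing up with rising counts, the crawler is revisiting them rather than finding new content.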

I would like to help you debug this. I sent you my project properties. Hopefully you can see that I have set it to load only from the starting directory. I did this in each of Text, Images, Video, Audio, Archive, User Defined and Other. I only have Archive and User Defined checked.

There is no depth limit set.

What does the Enable download directory check box do in the Parsing screen? I have run it with and without the box checked and it seems to have no effect.

I also have Check files integrity, Explore all possible subdirectories and Suppress Web site errors checked in the Parsing screen.

John
jondank 10/04/2011 11:59 pm
Oleg

I unchecked the integrity check and Explore all possible subdirectories. The number of files in the queue started to drop. I stopped the current run and restarted.

John
Oleg Chernavin 10/05/2011 02:17 am
I see now. It would be great to reproduce and improve that check.

Is it possible to give me access to the site and let me know several direct URLs that give that effect?

I could write to you directly at the e-mail address you use in this forum.

Thank you!

Oleg.
jondank 10/05/2011 09:15 pm
Oleg

I would not be able to give you access. I access my client's site through a VPN.

You can email me privately.

Not sure how I could technically let you see the screen.

What city are you in?

John
Oleg Chernavin 10/06/2011 06:19 am
OK. I just sent you an E-mail. Thank you!

Oleg.
mkirouac 12/28/2011 02:19 pm
Hi Oleg,

I'm having the same problem, where indexing a SharePoint site gets caught in an infinite loop and never quits. If I set a level restriction, even a huge one, it will finish but doesn't pull all the documents. A level of 5 should be more than enough to find all documents, but even at 100 it doesn't. So I am trying to get out of this infinite loop by adding entries to the filename filter to exclude the troublesome URLs, but this isn't having any effect. The URL filter doesn't seem to work either.

http://oursource/sites/ESDITStrategyNew/EntDiv/wd/Lists/Links/DispForm.aspx?ID=3

If I have a URL like the above, I would expect at least one of the filters below to prevent it from downloading but the Test button always says "The URL will be downloaded".

Links/DispForm.aspx
/Links/DispForm.aspx
*/Links/DispForm.aspx*
*/Links/DisplForm.aspx?*

Under URL Filter - filename, I have the below under exclude keywords.

addwrkfl.aspx
subnew.aspx
userdisplay.aspx
people.aspx
qlreord.aspx
lstsetng.aspx
settings.aspx
viewnew.aspx
listfeed.aspx
versiondiff.aspx
upload.aspx
/links/dispform.aspx
/calendar/dispform.aspx
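
For comparison, here is a quick sketch of how the patterns above behave under two generic matching rules: whole-string glob matching and case-insensitive substring matching. This is not Offline Explorer's filter logic, which the thread does not describe. It only shows what each rule would do with the sample URL.

```python
from fnmatch import fnmatchcase

URL = "http://oursource/sites/ESDITStrategyNew/EntDiv/wd/Lists/Links/DispForm.aspx?ID=3"

PATTERNS = [
    "Links/DispForm.aspx",
    "/Links/DispForm.aspx",
    "*/Links/DispForm.aspx*",
    "*/Links/DisplForm.aspx?*",
]

def glob_match(url, pattern):
    """Case-insensitive glob match against the whole URL."""
    return fnmatchcase(url.lower(), pattern.lower())

def keyword_match(url, keyword):
    """Case-insensitive substring match; '*' and '?' are treated literally."""
    return keyword.lower() in url.lower()

for p in PATTERNS:
    print(f"{p!r}: glob={glob_match(URL, p)} substring={keyword_match(URL, p)}")
```

Under these rules the first two entries only match as substrings and the third only matches as a glob. The fourth matches under neither rule, because it spells the page name `DisplForm` while the URL has `DispForm`.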

All the documents I care about are under "Shared Documents" so I can safely exclude a number of other URLs but it doesn't seem to work the way I expect.

I have done everything listed above in this thread but it did not help.

Suggestions?
Oleg Chernavin 12/28/2011 04:18 pm
Can we have a remote support session, so I can take a look at the site (I assume it is not available online)?

Please contact me via support@metaproducts.com and let's schedule the time.

Oleg.