Normandphiliperna – WPmines – Find your next dream job

380
4
0

RE : How to parse invalid (bad / not well-formed) XML?

Asked on July 17, 2020 in XML.

That "XML" is worse than invalid – it’s not well-formed; see Well Formed vs Valid XML.

An informal assessment of the predictability of the transgressions does not help. That textual data is not XML. No conformant XML tools or libraries can help you process it.

Options, most desirable first:
Have the provider fix the problem on their end. Demand well-formed XML. (Technically the phrase well-formed XML is redundant but may be useful for emphasis.)
Use a tolerant markup parser to cleanup the problem ahead of parsing as XML:
Standalone: xmlstarlet has robust recovering and repair capabilities^{_{credit: RomanPerekhrest}}
xmlstarlet fo -o -R -H -D bad.xml 2>/dev/null 
Standalone and C/C++: HTML Tidy works with XML too. Taggle is a port of TagSoup to C++.

Python: Beautiful Soup is Python-based. See notes in the Differences between parsers section. See also answers to this question for more suggestions for dealing with not-well-formed markup in Python, including especially lxml’s recover=True option. See also this answer for how to use codecs.EncodedFile() to cleanup illegal characters.

Java: TagSoup and JSoup focus on HTML. FilterInputStream can be used for preprocessing cleanup.

.NET:

XmlReaderSettings.CheckCharacters can be disabled to get past illegal XML character problems.

@jdweng notes that XmlReaderSettings.ConformanceLevel can be set to ConformanceLevel.Fragment so that XmlReader can read XML Well-Formed Parsed Entities lacking a root element.

@jdweng also reports that XmlReader.ReadToFollowing() can sometimes be used to work-around XML syntactical issues, but note rule-breaking warning in #3 below.

Microsoft.Language.Xml.XMLParser is said to be “error-tolerant”.

PHP: See DOMDocument::$recover and libxml_use_internal_errors(true). See nice example here.

Ruby: Nokogiri supports “Gentle Well-Formedness”.

R: See htmlTreeParse() for fault-tolerant markup parsing in R.

Perl: See XML::Liberal, a "super liberal XML parser that parses broken XML."
Process the data as text manually using a text editor or programmatically using character/string functions. Doing this programmatically can range from tricky to impossible as what appears to be predictable often is not — rule breaking is rarely bound by rules.
For invalid character errors, use regex to remove/replace invalid characters:

PHP: preg_replace('/[^\x{0009}\x{000a}\x{000d}\x{0020}-\x{D7FF}\x{E000}-\x{FFFD}]+/u', ' ', $s); – Ruby: string.tr("^\u{0009}\u{000a}\u{000d}\u{0020}-\u{D7FF}\u{E000‌}-\u{FFFD}", ' ') – JavaScript: inputStr.replace(/[^\x09\x0A\x0D\x20-\xFF\x85\xA0-\uD7FF\uE000-\uFDCF\uFDE0-\uFFFD]/gm, '')
For ampersands, use regex to replace matches with &:^{_{credit: blhsin, demo}}
&(?!(?:#\d+|#x[0-9a-f]+|\w+);) 
Note that the above regular expressions won’t take comments or CDATA sections into account.

0
2
0

RE : net.sf.saxon.s9api.SaxonApiException: I/O error reported by XML parser processing null: null

Asked on July 17, 2020 in XML.

ClassLoader.getResourceAsStream() returns null if the resource cannot be found, and that’s what the stacktrace suggests is happening here.

359
1
0

RE : CancellationTokenSource vs. volatile boolean

Asked on July 16, 2020 in .NET.

Of course yes. There are many. I’ll list few.

CancellationToken supports callbacks. You can be notified when the cancellation is requested.

CancellationToken supports WaitHandle which you could wait for indefinitely or with a timeout.

You can schedule the cancelation of CancellationToken using CancellationTokenSource.CancelAfter method.

You can link your CancellationToken to another, so that when one is cancelled another can be considered as cancelled.

By Task if you mean System.Threading.Tasks.Task a volatile boolean cannot transition the state of the Task to cancelled but CancellationToken can.

343
1
0

RE : Creating Groups for Xcode Templates

Asked on July 16, 2020 in Swift.

Ok so I guess this is quite obscure Xcode stuff but I’ve worked out how to fix this if anyone is interested.

If you provide a folder for the Nodes you want to put in a Group like this:
<key>Nodes</key>     <array>         <string>Engine/Camera.swift</string>         <string>Engine/Node.swift</string>         <string>Engine/Renderer.swift</string>         <string>Utility/MathLibrary.swift</string>         <string>Shaders/Shaders.metal</string> 
Then you can go ahead and place them in a Group using the method described in the original post:
<key>Definitions</key>     <dict>         <key>Utility/MathLibrary.swift</key>         <dict>             <key>Group</key>             <array>                 <string>Utility</string>             </array>             <key>Path</key>             <string>MathLibrary.swift</string>         </dict> 
This is NOT what is described in any of the articles/tutorials I have looked at, for example:

https://www.hackingwithswift.com/articles/158/how-to-create-a-custom-xcode-template-for-coordinators

https://www.telerik.com/blogs/how-to-create-custom-project-templates-in-xcode-7

These both say to arrange your nodes like this:
<key>Nodes</key>     <array>         <string>Camera.swift</string> 
Which will not work if you want to use Groups.

317
1
0

RE : Prevent task from starting when added to a concurrent queue

Asked on July 16, 2020 in .NET.

It seems to me that you’re over-complicating things here.

I’m not sure you need any queue at all.

This approach allows the network access work to occur concurrently:

using var semaphore = new SemaphoreSlim(1); var tasks = pieces.Select(piece => {     var request = new HttpRequestMessage { RequestUri = new Uri(url) };     request.Headers.Range = new RangeHeaderValue(piece.start, piece.end);      //Send the request     var download = await client.SendAsync(request);      //Use interlocked to increment Tasks done by one     Interlocked.Increment(ref OctaneEngine.TasksDone);      var stream = await download.Content.ReadAsStreamAsync();          using (var fs = new FileStream(piece._tempfilename, FileMode.OpenOrCreate,         FileAccess.Write))     {         await semaphore.WaitAsync(); // Only allow one concurrent file write         await stream.CopyToAsync(fs);     }      semaphore.Release();      Interlocked.Increment(ref TasksDone); });  await Task.WhenAll(tasks);

Because your work involves I/O, there is little benefit trying to use multiple threads via Parallel.ForEach; awaiting the async methods will release threads to process other requests.

887
28
0

RE : How can I prevent SQL injection in PHP?

Asked on July 16, 2020 in Mysql.

A few guidelines for escaping special characters in SQL statements.

Don’t use MySQL. This extension is deprecated. Use MySQLi or PDO instead.

MySQLi

For manually escaping special characters in a string you can use the mysqli_real_escape_string function. The function will not work properly unless the correct character set is set with mysqli_set_charset.

Example:
$mysqli = new mysqli('host', 'user', 'password', 'database'); $mysqli->set_charset('charset');  $string = $mysqli->real_escape_string($string); $mysqli->query("INSERT INTO table (column) VALUES ('$string')"); 
For automatic escaping of values with prepared statements, use mysqli_prepare, and mysqli_stmt_bind_param where types for the corresponding bind variables must be provided for an appropriate conversion:

Example:
$stmt = $mysqli->prepare("INSERT INTO table (column1, column2) VALUES (?,?)");  $stmt->bind_param("is", $integer, $string);  $stmt->execute(); 
No matter if you use prepared statements or mysqli_real_escape_string, you always have to know the type of input data you’re working with.

So if you use a prepared statement, you must specify the types of the variables for mysqli_stmt_bind_param function.

And the use of mysqli_real_escape_string is for, as the name says, escaping special characters in a string, so it will not make integers safe. The purpose of this function is to prevent breaking the strings in SQL statements, and the damage to the database that it could cause. mysqli_real_escape_string is a useful function when used properly, especially when combined with sprintf.

Example:
$string = "x' OR name LIKE '%John%"; $integer = '5 OR id != 0';  $query = sprintf( "SELECT id, email, pass, name FROM members WHERE email ='%s' AND id = %d", $mysqli->real_escape_string($string), $integer);  echo $query; // SELECT id, email, pass, name FROM members WHERE email ='x\' OR name LIKE \'%John%' AND id = 5  $integer = '99999999999999999999'; $query = sprintf("SELECT id, email, pass, name FROM members WHERE email ='%s' AND id = %d", $mysqli->real_escape_string($string), $integer);  echo $query; // SELECT id, email, pass, name FROM members WHERE email ='x\' OR name LIKE \'%John%' AND id = 2147483647 

299
3
0

RE : How can I select all columns from a subquery that only contains some?

Asked on July 16, 2020 in Mysql.

One approach can be using temporary tables:

   with max_time as    (      select id,MAX(grab_time) as grab_time FROM events where ended='false' GROUP BY       id    )    select * from events e,max_time mt where e.id=mt.id and e.grab_time=mt.grab_time;

269
1
0

RE : SQL numbering merge if string is the same

Asked on July 16, 2020 in Mysql.

There is no need to aggregate twice or more.
Use UNION ALL to get all the rows from the 2 tables and then aggregate:
SELECT t.dj_1, COUNT(*) AS uren  FROM (   SELECT dj_1 FROM dj_rooster    WHERE week = '28'    UNION ALL    SELECT dj_1 FROM djpaneel_shows_uren    WHERE week = '28' AND active = '1'  ) t GROUP BY t.dj_1  ORDER BY uren DESC 

0
1
0

RE : microbit Python: When using time module, how to keep sleep() units in milliseconds?

Asked on July 16, 2020 in Python.

Use from ... import ... as notation.
from microbit import sleep as microbit_sleep from time import sleep as normal_sleep  microbit_sleep(1000) # sleeps for one second normal_sleep(1000) # sleeps for much longer 
Or, if you need everything in those two modules, just do a normal import.
import microbit import time  microbit.sleep(1000) time.sleep(1) 
from ... import * is generally considered bad Python style precisely for the reasons you’ve discovered here. It’s okay for really quick scripts, but best avoided as projects get larger and depend on more modules.

319
1
0

RE : Python PIL ImageDraw.polygon not drawing the polygon if it is a line or a point

Asked on July 16, 2020 in Python.

While this does not explain the behaviour of the .polygon method, I have found a workaround.

By combining .polygon with ~~.point~~ .line you will get the desired result. (See edit below on why to use .line over .point)

#1 Point

Python 3.7.6 (v3.7.6:43364a7ae0, Dec 18 2019, 14:18:50)  [Clang 6.0 (clang-600.0.57)] on darwin Type "help", "copyright", "credits" or "license" for more information. >>> from PIL import Image, ImageDraw >>> import numpy as np >>> img = Image.new('L', (5, 5), 0) >>> ImageDraw.Draw(img).polygon([2, 2, 2, 2, 2, 2, 2, 2], outline=1, fill=1) >>> ImageDraw.Draw(img).line([2, 2, 2, 2, 2, 2, 2, 2], fill=1) >>> np.array(img) array([[0, 0, 0, 0, 0],        [0, 0, 0, 0, 0],        [0, 0, 1, 0, 0],        [0, 0, 0, 0, 0],        [0, 0, 0, 0, 0]], dtype=uint8)

#2 Line

>>> img = Image.new('L', (5, 5), 0) >>> ImageDraw.Draw(img).polygon([1, 1, 2, 1, 2, 1, 1, 1], outline=1, fill=1) >>> ImageDraw.Draw(img).line([1, 1, 2, 1, 2, 1, 1, 1], fill=1) >>> np.array(img) array([[0, 0, 0, 0, 0],        [0, 1, 1, 0, 0],        [0, 0, 0, 0, 0],        [0, 0, 0, 0, 0],        [0, 0, 0, 0, 0]], dtype=uint8)

#3 Rectangle

>>> img = Image.new('L', (5, 5), 0) >>> ImageDraw.Draw(img).polygon([1, 1, 2, 1, 2, 2, 1, 2], outline=1, fill=1) >>> ImageDraw.Draw(img).line([1, 1, 2, 1, 2, 2, 1, 2], fill=1) >>> np.array(img) array([[0, 0, 0, 0, 0],        [0, 1, 1, 0, 0],        [0, 1, 1, 0, 0],        [0, 0, 0, 0, 0],        [0, 0, 0, 0, 0]], dtype=uint8)

EDIT: To make this also work for longer lines (see below) .line has to be used instead of .point.

With .point

>>> img = Image.new('L', (5, 5), 0) >>> ImageDraw.Draw(img).polygon([1, 1, 3, 1, 3, 1, 1, 1], outline=1, fill=1) >>> ImageDraw.Draw(img).point([1, 1, 3, 1, 3, 1, 1, 1], fill=1) >>> np.array(img) array([[0, 0, 0, 0, 0],        [0, 1, 0, 1, 0],        [0, 0, 0, 0, 0],        [0, 0, 0, 0, 0],        [0, 0, 0, 0, 0]], dtype=uint8)

With .line

>>> img = Image.new('L', (5, 5), 0) >>> ImageDraw.Draw(img).polygon([1, 1, 3, 1, 3, 1, 1, 1], outline=1, fill=1) >>> ImageDraw.Draw(img).line([1, 1, 3, 1, 3, 1, 1, 1], fill=1) >>> np.array(img) array([[0, 0, 0, 0, 0],        [0, 1, 1, 1, 0],        [0, 0, 0, 0, 0],        [0, 0, 0, 0, 0],        [0, 0, 0, 0, 0]], dtype=uint8)