Sharing with you guys a very useful tool/browser extension, but there is an issue with it


I will first give a little background on this tool and then, at the end, explain the two problems with it.
This tool is unlike any other I've come across. It lets you highlight any text in your Chromium web browser and then quickly search for it in any search engine of your choice. Little bubble icons pop up automatically, which cuts the time it takes to search the selected text down to milliseconds compared with using a right-click context menu, etc.

The search engines I use are:
A.) Google
B.) www.northboot.xyz
C.) DuckDuckGo
D.) Yandex
E.) Bing

It might have had something malicious in its source code. I discovered it on Google's extension web store, and it's no longer available there. I've caught browser extensions on the Google web store with malicious code in them in the past.

Here is its crx: Select-Search,bubble search engine,am using,webstore.crx | Files.fm.

That's its source code. A weird problem I've been having since, I guess, I started using Zorin is that when I rename the .crx to .zip to extract its source code, it doesn't work. I used to do this with no problem on Windows 7. On Zorin I end up having to use WinRAR.
Why can't Zorin's native extractor work with this?

Next, the reason I need to look at its source code is that it might be rerouting my searches and logging them. I think uBlock Origin caught it trying to send my searches to the URL: https://go.skimresources.com/?id=31959X1726082&isjs=1&jv=15.6.0&sref=https

Once I did extract the source code, I tried to search for any IP address, http string, or URL of any kind, but I couldn't find anything. However, I'm not sure I wrote my grep -r -d recurse command chain correctly to search all of that properly.

So, my issues are: 1.) Why can't Zorin natively unzip a .zip file?

And 2.) How do I create a command chain that searches the unzipped contents for any kind of IP address or URL, or for any way that it might be maliciously phoning home?

-Thanks.

A .crx is not exactly the same as a .zip file: it contains additional information (a header placed in front of the ZIP data) that the default extractor doesn't expect, so it fails to decompress. From the terminal, the unzip command will work fine.
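
As a rough illustration of why the extractor gets confused: a .crx normally begins with the four-byte marker Cr24 and a header, whereas a plain .zip begins with PK. The sketch below builds a stand-in file (fake.crx is a made-up name and its contents are not a real extension) just to show how to peek at those first bytes.

```shell
#!/bin/sh
# Hypothetical stand-in file: the "Cr24" marker followed by placeholder bytes.
# A real .crx has a binary header here, followed by the ZIP archive.
workdir=$(mktemp -d)
printf 'Cr24placeholder-header-and-zip-data' > "$workdir/fake.crx"

# Print the first four bytes of a file.
first_bytes() {
    head -c 4 "$1"
}

magic=$(first_bytes "$workdir/fake.crx")
echo "first bytes: $magic"

rm -rf "$workdir"
```

On the real file, something like unzip extension.crx -d extension_src (both names are placeholders) should extract it; unzip usually prints a warning about extra bytes at the beginning and carries on.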

A preliminary check can look something like this:

grep -Poi 'https?[^\s]*[^.,;]' *.{html,css,js}

This will check for any URLs inside all .html, .css and .js files in the current directory. From there, I would suggest going slowly over any matching files and examining the context. You can switch the matching pattern to something like '(\d{1,3}\.){3}\d+' to search for IPv4 addresses.
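
A minimal sketch of the IPv4 variant, run against a throwaway folder so it is safe to try. The file name and the addresses are invented for the demo, and I am assuming -E with [0-9] here instead of -P with \d only so that it also works where grep lacks PCRE support.

```shell
#!/bin/sh
# Throwaway demo folder; sample.js and its contents are made up.
workdir=$(mktemp -d)
cat > "$workdir/sample.js" <<'EOF'
var version = "15.6.0";
var host = "203.0.113.7";
fetch("http://198.51.100.23/collect");
EOF

# -r recurse, -o print only the match, -h hide file names, -E extended regex.
# Four dot-separated groups of 1-3 digits; "15.6.0" has only three groups,
# so it is not reported.
ipv4_hits=$(grep -rohE '([0-9]{1,3}\.){3}[0-9]{1,3}' "$workdir")
echo "$ipv4_hits"

rm -rf "$workdir"
```

The pattern is deliberately loose: it would also accept something like 999.999.999.999, so treat each hit as a line to go and read rather than as proof of anything.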

But the better answer to this is: if you don't trust this code and suspect malicious activity is involved, don't use it. Finding malicious code that doesn't want to be found can be trickier than it seems at first, and there are techniques to obfuscate code, such as minification, which is what running the search above on this source code turns up.
For what it's worth, VirusTotal doesn't report any problems.

You can get something very similar to what you're trying to achieve using a bookmarklet instead of a web extension. A bookmarklet is just like a bookmark, except that instead of opening a new page, it will run some JavaScript code.
This is an example of how it could be implemented; nothing fancy, but it works:

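/* Text currently highlighted on the page */
const selection = window.getSelection().toString().trim();
const searchEngine = 'https://qwant.com?q=';
/* Encode the query so spaces and special characters survive in the URL */
const queryStr = encodeURIComponent('define: ' + selection);
/* window.open can return null if the pop-up is blocked, hence the ?. */
window.open(searchEngine + queryStr, '_blank')?.focus();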

This will open a new tab and search for "define: " followed by the selected text, using Qwant as the search engine. To use it, create a new bookmark and paste this code into the URL field, but it has to be wrapped in a special syntax:

javascript:(() => { })();
                   ^
         code goes in between these curly braces

NOTE: Discourse has ligatures enabled to display certain combinations of characters a little prettier. The arrow above (=>) is actually a combination of = and >.


Thanks! Great answer.
1.) Although I do want to point out that the extension and code did come from Google webstore, and they scan their extension code regularly.

2.) Where does the file or folder path go in this command chain?
grep -Poi 'https?[^\s]*[^.,;]' *.{html,css,js}

3.) I also want to thank you for showing me how to use bookmarklet/javascript code, I've used similar javascript before, but never in this way, and it worked very well. :slight_smile:


I'm sure Google does some scanning but malicious code has found its way to the end user before. But in any case, there is an important trust factor when it comes to software, as one couldn't possibly inspect every line of code of every program.
I assume that since you are interested in finding out which domains are being contacted, if any, that you have some doubts about it.

The last part of the code uses shell expansion: * means all files (in the current directory), followed by a dot (.) and then all combinations of "css", "html" and "js". If you are not in the same directory as the source code when running this, you just need to adjust it:

grep -Poi 'https?[^\s]*[^.,;]' <path/to/source/code>/*.{html,css,js}

@zenzen, I don't want it to search only html, css and js files. I want it to search all files in the folder. How do I do that?
Is this correct?:
grep -Poi 'https?[^\s]*[^.,;]'

If you want all files, just add a star at the end. I only used the html, css and js file extensions because those are the ones found in a browser extension:

grep -Poi 'https?[^\s]*[^.,;]' ./*

This however has a couple of caveats regarding folders. One is that grep will treat folders as regular files and show a little error message. The error message is harmless and can be safely ignored, but it also means that grep won't know to look inside of those folders to examine any files within them.

If this is not an issue, then you can run the command above as is. You can even modify it a bit to ignore the errors so that they don't distract you from any output that you actually care about:

grep -Poi 'https?[^\s]*[^.,;]' ./* 2>/dev/null

To look for all files in the current folder as well as all files inside any other folders inside of it recursively, you can use the -r flag and not specify any location (but you have to navigate to the target folder you want to search through upfront):

grep -Proi 'https?[^\s]*[^.,;]'

EDIT: Below is my original solution using find, but grep also has its own way of searching through a folder tree. I only realized this after I wrote it and looked it up to confirm it works. I'll leave the rest here in case it helps anyway, as it's just another way of doing things.

But if you want to look not only in the current folder but also inside every folder below it, all the way down, you can combine grep with the find command:

find . -type f -exec grep -Poi 'https?[^\s]*[^.,;]' {} \;

It will take a little longer depending on how many files it has to look through inside the current directory. If you only want to know which files contain matches, run this command again but change grep -Poi to grep -Pli.
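
To make the -l variant concrete, here is a small sketch on a made-up folder tree (all file names and URLs are invented). It uses grep's own -r rather than find, and lists each file that contains at least one URL-looking string.

```shell
#!/bin/sh
# Throwaway folder tree that loosely imitates an unpacked extension.
workdir=$(mktemp -d)
mkdir -p "$workdir/js" "$workdir/css"
echo 'var u = "https://example.com/api";'       > "$workdir/js/background.js"
echo 'body { color: red; }'                     > "$workdir/css/style.css"
echo '{ "homepage_url": "http://example.org" }' > "$workdir/manifest.json"

# -r recurse, -l print only the names of matching files,
# -i ignore case, -E extended regex.
# Paths are relative to the folder we cd into; sort keeps the order stable.
matching_files=$(cd "$workdir" && grep -rliE 'https?://' . | sort)
echo "$matching_files"

rm -rf "$workdir"
```

Once you have the list of files, you can go back to -o (or -n for line numbers) on just those files to see the matches in context.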


@zenzen, oh cool! Thanks! Very helpful.

"html, css and js file extensions as those are the ones found in a browser extension"

Also, there are .txt files and multiple .json files in .crx files too, and possibly other file types sometimes (rarely). :slight_smile:

"grep will treat folders as regular files and show a little error message."

I get no error messages when I search folders, even without using 2>/dev/null. What does a grep folder error message look like?

-Thanks!

It should say something like "grep: ./Documents: Is a directory" (Documents in this case being one of the folders that it found in my current directory). But I was only mentioning this in case it did, if it didn't then all the better.

I would be surprised if there was a strange URL in plain sight inside a text file that is meant to be human-readable, like JSON. But yes, check everything for good measure.


I did grep -Proi 'https?[^\s]*[^.,;]'

in a folder that had 11 other folders in it and I never got a "grep: ./Documents: Is a directory" :slight_smile:

Correct, that's what the r in -Proi is for (recursive search).


1.) I thought you said I have to add 2>/dev/null to avoid getting a folder error?

2.) How would someone use minification or obfuscation to evade the search command we created and still create a backdoor?

Here is someone's suggestion:

"i don't really know much about extensions or what are you trying to do but the commands look ok for https/http and ipv4, but for ipv6 (used ChatGpt to generate it) use:

grep -Proi '([0-9a-fA-F]{1,4}:){7}[0-9a-fA-F]{1,4}|(::([0-9a-fA-F]{1,4}:){0,6}[0-9a-fA-F]{1,4})|([0-9a-fA-F]{1,4}:){1,7}:'

I didn't want this to get confusing, but I also didn't want to show just one way of doing things (as I had already written a reply using find). In short, there are usually multiple ways of doing the same thing, each with its own pros and cons.

By default grep reads from files, and shows an error when it encounters a folder instead. You can provide the recursive flag, either -r or --recursive, to change that behavior: when grep encounters a folder, it reads all the files within it instead (including sub-folders, repeating the process recursively as needed). So with -r there is no folder error to hide, and 2>/dev/null isn't needed.
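
The difference is easy to see on a throwaway folder. In this sketch (folder and file names are made up), the same pattern is run once without -r and once with it; only the recursive run reaches the file inside the sub-folder.

```shell
#!/bin/sh
# Throwaway folder with one file at the top and one inside a sub-folder.
workdir=$(mktemp -d)
mkdir "$workdir/sub"
echo 'top: https://example.com'    > "$workdir/top.txt"
echo 'nested: https://example.net' > "$workdir/sub/nested.txt"

# Without -r: ./sub is handed to grep as if it were a file. grep complains on
# stderr (hidden here with 2>/dev/null) and never looks inside it.
# "|| true" keeps the script going despite grep's non-zero exit status.
flat_hits=$(cd "$workdir" && grep -lE 'https?://' ./* 2>/dev/null || true)

# With -r: grep walks into ./sub as well, and there is no error to hide.
recursive_hits=$(cd "$workdir" && grep -rlE 'https?://' . | sort)

echo "without -r:"; echo "$flat_hits"
echo "with -r:";    echo "$recursive_hits"

rm -rf "$workdir"
```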

As for how minification or obfuscation could be used to evade the search, you will have to check with an expert about that one.


Thanks! Here's some more information I've gathered from research.

Encoded (obfuscated) malware:
Base64 and XOR encoding are the most popular.
Look for btoa() or atob() in JavaScript for Base64.
For XOR, look for eval() and use of the bitwise operator '^'.
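
A rough way to turn those hints into a search, sketched on an invented file. It simply lists the lines that call atob(), btoa() or eval(); a hit is only a place to start reading, since plenty of legitimate code uses these too.

```shell
#!/bin/sh
# Throwaway demo file; content.js and its contents are made up.
workdir=$(mktemp -d)
cat > "$workdir/content.js" <<'EOF'
var label = "plain text";
var target = atob("aHR0cHM6Ly9leGFtcGxlLmNvbQ==");
eval("console.log(1)");
EOF

# -r recurse, -n show line numbers, -h hide file names, -E extended regex.
# Matches the function name, optional spaces, then an opening parenthesis.
suspect_lines=$(grep -rnhE '(atob|btoa|eval)[[:space:]]*\(' "$workdir")
echo "$suspect_lines"

rm -rf "$workdir"
```

Searching for a bare ^ tends to be much noisier, because the same character shows up in regular expressions all over ordinary JavaScript.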

"Unless it's a very amateur malicious code, you are unlikely to find a pure string of an IP address or http or something of the sort. Most well crafted exploits that phone home will create the address from multiple variables, sometimes even bit shifting to get specific characters, then concatenating those at different steps."

"The commands look ok for https/http and ipv4, but for ipv6 (used ChatGpt to generate it) use:
grep -Proi '([0-9a-fA-F]{1,4}:){7}[0-9a-fA-F]{1,4}|(::([0-9a-fA-F]{1,4}:){0,6}[0-9a-fA-F]{1,4})|([0-9a-fA-F]{1,4}:){1,7}:'
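
On the point quoted above about addresses not being written out as plain strings, here is a small sketch of why a plain-text search can miss them. The URL and file name are invented; the point is only that once the string is Base64-encoded, the https? pattern no longer matches until the value is decoded.

```shell
#!/bin/sh
workdir=$(mktemp -d)
url='https://example.com/collect'   # made-up address for the demo

# Store only the Base64 form in the file, the way an obfuscated script might.
encoded=$(printf '%s' "$url" | base64)
echo "var t = atob(\"$encoded\");" > "$workdir/hidden.js"

# The plain URL search finds nothing (grep -c prints 0 when nothing matches;
# "|| true" absorbs grep's non-zero exit status in that case).
plain_hits=$(grep -cE 'https?://' "$workdir/hidden.js" || true)

# Decoding the stored value brings the address back.
decoded=$(printf '%s' "$encoded" | base64 -d)

echo "plain-text hits: $plain_hits"
echo "decoded value:   $decoded"

rm -rf "$workdir"
```

Any string that starts with http encodes to Base64 text starting with aHR0c, so searching for that prefix can catch the simplest cases. It will not help once the address is split into pieces or XOR-ed, which is the scenario the quote describes.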